Abstract

In the study of differential privacy, composition theorems (starting with the original paper
of Dwork, McSherry, Nissim, and Smith (TCC’06)) bound the degradation of privacy when
composing several differentially private algorithms. Kairouz, Oh, and Viswanath (ICML’15)
showed how to compute the optimal bound for composing $k$ arbitrary $(\epsilon, \delta)$-differentially private
algorithms. We characterize the optimal composition for the more general case of $k$ arbitrary
$(\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k)$-differentially private algorithms where the privacy parameters may differ
for each algorithm in the composition. We show that computing the optimal composition in
general is #P-complete. Since computing optimal composition exactly is infeasible (unless
FP=#P), we give an approximation algorithm that computes the composition to arbitrary
accuracy in polynomial time. The algorithm is a modification of Dyer’s dynamic programming
approach to approximately counting solutions to knapsack problems (STOC’03).


1 Introduction 

Differential privacy is a framework that allows statistical analysis of private databases while
minimizing the risks to individuals in the databases. The idea is that an individual should be relatively
unaffected whether he or she decides to join or opt out of a research dataset. More specifically, the
probability distribution of outputs of a statistical analysis of a database should be nearly identical
to the distribution of outputs on the same database with a single person’s data removed. Here
the probability space is over the coin flips of the randomized differentially private algorithm that
handles the queries. To formalize this, we call two databases $D_0, D_1$ with $n$ rows each neighboring
if they are identical on at least $n - 1$ rows, and define differential privacy as follows:


Definition 1.1 (Differential Privacy [DMNS06, DKMMN06]). A randomized algorithm $M$ is
$(\epsilon, \delta)$-differentially private for $\epsilon, \delta \geq 0$ if for all pairs of neighboring databases $D_0$ and $D_1$ and all output
sets $S \subseteq \mathrm{Range}(M)$

\[ \Pr[M(D_0) \in S] \leq e^{\epsilon} \Pr[M(D_1) \in S] + \delta \]

where the probabilities are over the coin flips of the algorithm $M$.

In the practice of differential privacy, we generally think of $\epsilon$ as a small, non-negligible, constant
(e.g. $\epsilon = .1$). We view $\delta$ as a “security parameter” that is cryptographically small (e.g. $\delta = \ldots$).

One of the important properties of differential privacy is that if we run multiple distinct
differentially private algorithms on the same database, the resulting composed algorithm is also
differentially private, albeit with some degradation in the privacy parameters $(\epsilon, \delta)$. In this paper,
we are interested in quantifying the degradation of privacy under composition. We will denote the
composition of $k$ differentially private algorithms $M_1, M_2, \ldots, M_k$ as $(M_1, M_2, \ldots, M_k)$ where

\[ (M_1, M_2, \ldots, M_k)(x) = (M_1(x), M_2(x), \ldots, M_k(x)) \]

A handful of composition theorems already exist in the literature. The first basic result says: 

Theorem 1.2 (Basic Composition [DKMMN06]). For every $\epsilon \geq 0$, $\delta \in [0,1]$, and $(\epsilon, \delta)$-differentially
private algorithms $M_1, M_2, \ldots, M_k$, the composition $(M_1, M_2, \ldots, M_k)$ satisfies $(k\epsilon, k\delta)$-differential
privacy.

This tells us that under composition, the privacy parameters of the individual algorithms “sum 
up,” so to speak. We care about understanding composition because in practice we rarely want to 
release only a single statistic about a dataset. Releasing many statistics may require running
multiple differentially private algorithms on the same database. Composition is also a very useful tool
in algorithm design. Often, new differentially private algorithms are created by combining several 
simpler algorithms. Composition theorems help us analyze the privacy properties of algorithms 
designed in this way. 

Theorem 1.2 shows a linear degradation in global privacy as the number of algorithms in the
composition ($k$) grows and it is of interest to improve on this bound. If we can prove that privacy
degrades more slowly under composition, we can get more utility out of our algorithms under the
same global privacy guarantees. Dwork, Rothblum, and Vadhan gave the following improvement
on the basic summing composition above [DRV10].

Theorem 1.3 (Advanced Composition [DRV10]). For every $\epsilon > 0$, $\delta, \delta' > 0$, $k \in \mathbb{N}$, and
$(\epsilon, \delta)$-differentially private algorithms $M_1, M_2, \ldots, M_k$, the composition $(M_1, M_2, \ldots, M_k)$ satisfies
$(\epsilon_g, k\delta + \delta')$-differential privacy for

\[ \epsilon_g = \sqrt{2k \ln(1/\delta')} \cdot \epsilon + k \cdot \epsilon \cdot (e^{\epsilon} - 1) \]

Theorem 1.3 shows that privacy under composition degrades by a function of $O(\sqrt{k \ln(1/\delta')})$,
which is an improvement over the linear degradation of Theorem 1.2 when $\ln(1/\delta')$ is small relative
to $k$. It can be shown that a degradation function of $\Omega(\sqrt{k \ln(1/\delta)})$
is necessary even for the simplest differentially private algorithms, such as randomized response
[War65].
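To make the gap between Theorems 1.2 and 1.3 concrete, the following minimal Python sketch evaluates both bounds for a homogeneous composition. The function names and the sample parameter values are illustrative assumptions, not part of the paper.

```python
import math


def basic_composition(eps, delta, k):
    """Theorem 1.2: the k-fold composition is (k*eps, k*delta)-DP."""
    return k * eps, k * delta


def advanced_composition(eps, delta, k, delta_prime):
    """Theorem 1.3: the k-fold composition is (eps_g, k*delta + delta_prime)-DP."""
    eps_g = (math.sqrt(2 * k * math.log(1 / delta_prime)) * eps
             + k * eps * (math.exp(eps) - 1))
    return eps_g, k * delta + delta_prime


if __name__ == "__main__":
    # Illustrative values only (assumed): 100 algorithms, each (0.1, 0)-DP.
    eps, delta, k, delta_prime = 0.1, 0.0, 100, 1e-6
    print("basic:   ", basic_composition(eps, delta, k))
    print("advanced:", advanced_composition(eps, delta, k, delta_prime))
```

For these sample values the advanced bound gives a global $\epsilon_g$ of roughly 6.3, compared with 10 from basic composition, at the cost of an additional $\delta'$ in the global $\delta$.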
Despite giving an asymptotically correct upper bound for the global privacy parameter, $\epsilon_g$,
Theorem 1.3 is not exact. We want an exact characterization because, beyond being theoretically
interesting, constant factors in composition theorems can make a substantial difference in the
practice of differential privacy. Furthermore, Theorem 1.3 only applies to “homogeneous” composition
where each individual algorithm has the same pair of privacy parameters, $(\epsilon, \delta)$. In practice we
often want to analyze the more general case where some individual algorithms in the composition
may offer more or less privacy than others. That is, given algorithms $M_1, M_2, \ldots, M_k$, we want
to compute the best achievable privacy parameters for $(M_1, M_2, \ldots, M_k)$. Formally, we want to
compute the function:

\[ \mathrm{OptComp}(M_1, M_2, \ldots, M_k, \delta_g) = \inf\{\epsilon_g \geq 0 : (M_1, M_2, \ldots, M_k) \text{ is } (\epsilon_g, \delta_g)\text{-DP}\} \]

It is convenient for us to view $\delta_g$ as given and then compute the best $\epsilon_g$, but the dual formulation,
viewing $\epsilon_g$ as given, is equivalent (by binary search). Actually, we want a function that depends
only on the privacy parameters of the individual algorithms:

\[ \mathrm{OptComp}((\epsilon_1, \delta_1), (\epsilon_2, \delta_2), \ldots, (\epsilon_k, \delta_k), \delta_g) = \sup\{\mathrm{OptComp}(M_1, M_2, \ldots, M_k, \delta_g) : M_i \text{ is } (\epsilon_i, \delta_i)\text{-DP } \forall i \in [k]\} \]

In other words we want OptComp to give us the minimum possible $\epsilon_g$ that maintains privacy
for every sequence of algorithms with the given privacy parameters $(\epsilon_i, \delta_i)$. A result from Kairouz,
Oh, and Viswanath [KOV15] characterizes OptComp for the homogeneous case.


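Theorem 1.4 (Optimal Homogeneous Composition [KOV15])^1. For every $\epsilon \geq 0$ and $\delta \in [0, 1)$,
$\mathrm{OptComp}((\epsilon, \delta), (\epsilon, \delta), \ldots, (\epsilon, \delta), \delta_g)$ equals the least value of $\epsilon_g \geq 0$ such that

\[ \frac{1}{(1 + e^{\epsilon})^k} \sum_{\ell = \lceil (\epsilon_g + k\epsilon)/(2\epsilon) \rceil}^{k} \binom{k}{\ell} \left( e^{\ell\epsilon} - e^{\epsilon_g} e^{(k - \ell)\epsilon} \right) \leq 1 - \frac{1 - \delta_g}{(1 - \delta)^k} \]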

Empirically (see Appendix A), this optimal bound provides a 30-40% savings in $\epsilon_g$ compared
to Theorem 1.3 (and a 20% savings compared to an improved asymptotic bound from [KOV15]).
The problem remains to find the optimal composition behavior for the more general heterogeneous
case. Kairouz, Oh, and Viswanath also provide an upper bound for heterogeneous composition that
generalizes the $O(\sqrt{k \ln(1/\delta')})$ degradation found in Theorem 1.3 for homogeneous composition but
do not comment on how close it is to optimal.


1.1 Our Results 

We begin by extending the results of Kairouz, Oh, and Viswanath [KOV15] to the general
heterogeneous case.

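Theorem 1.5 (Optimal Heterogeneous Composition). For all $\epsilon_1, \ldots, \epsilon_k \geq 0$ and $\delta_1, \ldots, \delta_k, \delta_g \in
[0, 1)$, $\mathrm{OptComp}((\epsilon_1, \delta_1), (\epsilon_2, \delta_2), \ldots, (\epsilon_k, \delta_k), \delta_g)$ equals the least value of $\epsilon_g \geq 0$ such that

\[ \frac{1}{\prod_{i=1}^{k} (1 + e^{\epsilon_i})} \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\} \leq 1 - \frac{1 - \delta_g}{\prod_{i=1}^{k} (1 - \delta_i)} \qquad (1) \]

^1 The phrasing of Theorem 1.4 is not exactly how it is presented in [KOV15] (which only refers to $\epsilon_g$ of the form
$(k - 2i)\epsilon$ for integer $i$), but this version can be deduced from the original.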
Theorem 1.5 exactly characterizes the optimal composition behavior for any arbitrary set of
differentially private algorithms. It also shows that optimal composition can be computed in time
exponential in $k$ by computing the sum over $S \subseteq \{1, \ldots, k\}$ by brute force. Of course in practice
an exponential-time algorithm is not satisfactory for large $k$. Our next result shows that this
exponential complexity is necessary:
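The brute-force computation mentioned above can be sketched in a few lines of Python. The sketch assumes inequality (1) reads: the sum over $S \subseteq \{1, \ldots, k\}$ of $\max\{e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} e^{\sum_{i \notin S} \epsilon_i}, 0\}$, divided by $\prod_i (1 + e^{\epsilon_i})$, is at most $1 - (1 - \delta_g)/\prod_i (1 - \delta_i)$. The function names, the floating-point arithmetic, and the binary-search tolerance are illustrative assumptions; this is not the approximation algorithm of this paper.

```python
import itertools
import math


def composed_delta(eps_list, delta_list, eps_g):
    """Smallest delta_g for which inequality (1) holds at the given eps_g.

    Enumerates all 2^k subsets S, so it is only usable for small k.
    """
    k = len(eps_list)
    total = 0.0
    for mask in itertools.product((0, 1), repeat=k):
        in_s = sum(e for e, b in zip(eps_list, mask) if b)
        out_s = sum(e for e, b in zip(eps_list, mask) if not b)
        total += max(math.exp(in_s) - math.exp(eps_g) * math.exp(out_s), 0.0)
    total /= math.prod(1 + math.exp(e) for e in eps_list)
    # Rearranging (1): delta_g >= 1 - prod(1 - delta_i) * (1 - total).
    return 1 - math.prod(1 - d for d in delta_list) * (1 - total)


def opt_comp(eps_list, delta_list, delta_g, tol=1e-9):
    """Binary search for the least eps_g satisfying (1), up to `tol`."""
    lo, hi = 0.0, sum(eps_list)
    if composed_delta(eps_list, delta_list, hi) > delta_g:
        return math.inf  # delta_g is below 1 - prod(1 - delta_i): no eps_g works
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if composed_delta(eps_list, delta_list, mid) <= delta_g:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Illustrative heterogeneous budget (assumed values).
    print(opt_comp([0.1, 0.2, 0.3, 0.4], [0.0] * 4, 1e-3))
```

The binary search relies on the left-hand side of (1) being non-increasing in $\epsilon_g$; the running time is dominated by the $2^k$ subsets visited on every evaluation.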

Theorem 1.6. Computing OptComp is #P-complete, even on instances where $\delta_1 = \delta_2 = \cdots =
\delta_k = 0$ and $\sum_{i \in [k]} \epsilon_i \leq \epsilon$ for any desired constant $\epsilon > 0$.

Recall that #P is the class of counting problems associated with decision problems in NP.
So being #P-complete means that there is no polynomial-time algorithm for OptComp unless
there is a polynomial-time algorithm for counting the number of satisfying assignments of boolean
formulas (or equivalently for counting the number of solutions of all NP problems). So there is
almost certainly no efficient algorithm for OptComp and therefore no analytic solution. Despite
the intractability of exact computation, we show that OptComp can be approximated efficiently.

Theorem 1.7. There is a polynomial-time algorithm that given rational $\epsilon_1, \ldots, \epsilon_k \geq 0$, $\delta_1, \ldots, \delta_k, \delta_g \in
[0, 1)$, and $\eta \in (0, 1)$, outputs $\epsilon^*$ satisfying

OptComp((ei,(5i),...,(efc,4),(5g) < e* < OptComp((ei, (5i),..., (cfe, 4), '4) + ^ 


The algorithm runs in time 


O 


k^-e-{l + e) 


log 


(1 + 4 


where e = ^*7^; assuming constant-time arithmetic operations. 

Note that we incur a relative error of $\eta$ in approximating $\delta_g$ and an additive error of $\eta$ in
approximating $\epsilon_g$. Since we always take $\epsilon_g$ to be non-negligible or even constant, we get a very
good approximation when $\eta$ is polynomially small or even a constant. Thus, it is acceptable that
the running time is polynomial in $1/\eta$.

In addition to the results listed above, our proof of Theorem 1.5 also provides a somewhat simpler
proof of the Kairouz-Oh-Viswanath homogeneous composition theorem (Theorem 1.4 [KOV15]).
The proof in [KOV15] introduces a view of differential privacy through the lens of hypothesis
testing and uses geometric arguments. Our proof relies only on elementary techniques commonly
found in the differential privacy literature.


Practical Application. The theoretical results presented here were motivated by our work on an
applied project called “Privacy Tools for Sharing Research Data”^2. We are building a system that
will allow researchers with sensitive datasets to make differentially private statistics about their
data available through data repositories using the Dataverse^3 platform [Cro11, Kin07]. Part of
this system is a tool that helps both data depositors and data analysts distribute a global privacy
budget across many statistics. Users select which statistics they would like to compute and are
given estimates of how accurately each statistic can be computed. They can also redistribute their
privacy budget according to which statistics they think are most valuable in their dataset. We
implemented the approximation algorithm from Theorem 1.7 and integrated it with this tool to
ensure that users get the most utility out of their privacy budget.

^2 privacytools.seas.harvard.edu

^3 dataverse.org
2 Technical Preliminaries

A useful notation for thinking about differential privacy is defined below.

Definition 2.1. For two discrete random variables $Y$ and $Z$ taking values in the same output space
$\mathcal{S}$, the $\delta$-approximate max-divergence of $Y$ and $Z$ is defined as:

\[ D_{\infty}^{\delta}(Y \| Z) \equiv \max_{S \subseteq \mathcal{S}} \left[ \ln \frac{\Pr[Y \in S] - \delta}{\Pr[Z \in S]} \right] \]

Notice that an algorithm $M$ is $(\epsilon, \delta)$ differentially private if and only if for all pairs of neighboring
databases, $D_0, D_1$, we have $D_{\infty}^{\delta}(M(D_0) \| M(D_1)) \leq \epsilon$. The standard fact that differential privacy
is closed under “post processing” [DMNS06, DR13] now can be formulated as:

Fact 2.2. If $f: \mathcal{S} \to \mathcal{R}$ is any randomized function, then

\[ D_{\infty}^{\delta}(f(Y) \| f(Z)) \leq D_{\infty}^{\delta}(Y \| Z) \]
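For distributions on a small finite output space, the $\delta$-approximate max-divergence can be evaluated directly from its definition by enumerating output sets. The Python sketch below does this; restricting the maximum to sets $S$ with $\Pr[Y \in S] > \delta$ (so that the logarithm is defined) and representing distributions as dictionaries are assumptions of this illustration.

```python
import itertools
import math


def approx_max_divergence(p_y, p_z, delta):
    """Brute-force delta-approximate max-divergence of Y from Z.

    p_y, p_z: dicts mapping outcomes to probabilities.
    Assumption: only sets S with Pr[Y in S] > delta are considered.
    Returns -inf if no such set exists and +inf if some such set
    has probability zero under Z.
    """
    outcomes = list(p_y)
    best = -math.inf
    for size in range(1, len(outcomes) + 1):
        for subset in itertools.combinations(outcomes, size):
            mass_y = sum(p_y[x] for x in subset)
            mass_z = sum(p_z.get(x, 0.0) for x in subset)
            if mass_y <= delta:
                continue
            if mass_z == 0.0:
                return math.inf
            best = max(best, math.log((mass_y - delta) / mass_z))
    return best


if __name__ == "__main__":
    # Binary randomized response with eps = 1 (illustrative values).
    e = math.exp(1.0)
    p_y = {0: e / (1 + e), 1: 1 / (1 + e)}
    p_z = {0: 1 / (1 + e), 1: e / (1 + e)}
    print(approx_max_divergence(p_y, p_z, 0.0))
    print(approx_max_divergence(p_y, p_z, 0.1))
```

With $\delta = 0$ the first call recovers the randomized-response parameter $\epsilon = 1$ up to floating-point error, and allowing $\delta = 0.1$ yields a strictly smaller divergence.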

Adaptive Composition. The composition results in our paper actually hold for a more general
model of composition than the one described in the introduction. The model is called $k$-fold adaptive
composition and was formalized in [DRV10]. We generalize their formulation to the heterogeneous
setting where privacy parameters may differ across different algorithms in the composition.

The idea is that instead of running $k$ differentially private algorithms chosen all at once on a
single database, we can imagine an adversary adaptively engaging in a “composition game.” The
game takes as input a bit $b \in \{0,1\}$ and privacy parameters $(\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k)$. A randomized
adversary $A$ tries to learn $b$ through $k$ rounds of interaction as follows: on the $i$th round of the game,
$A$ chooses an $(\epsilon_i, \delta_i)$-differentially private algorithm $M_i$ and two neighboring databases $D^{(i,0)}, D^{(i,1)}$.
$A$ then receives an output $y_i \leftarrow M_i(D^{(i,b)})$, where the internal randomness of $M_i$ is independent of
the internal randomness of $M_1, \ldots, M_{i-1}$. The choices of $M_i$, $D^{(i,0)}$, and $D^{(i,1)}$ may depend on
$y_1, \ldots, y_{i-1}$ as well as the adversary’s own randomness.

The outcome of this game is called the view of the adversary, $V^b$, which is defined to be
$(y_1, \ldots, y_k)$ along with $A$’s coin tosses. The algorithms $M_i$ and databases $D^{(i,0)}, D^{(i,1)}$ from each
round can be reconstructed from $V^b$. Now we can formally define privacy guarantees under $k$-fold
adaptive composition.

Definition 2.3. We say that the sequences of privacy parameters $\epsilon_1, \ldots, \epsilon_k \geq 0$, $\delta_1, \ldots, \delta_k \in [0,1)$
satisfy $(\epsilon_g, \delta_g)$-differential privacy under adaptive composition if for every adversary $A$ we have
$D_{\infty}^{\delta_g}(V^0 \| V^1) \leq \epsilon_g$, where $V^b$ represents the view of $A$ in composition game $b$ with privacy parameter
inputs $(\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k)$.


Computing real-valued functions. Many of the computations we discuss involve irrational
numbers and we need to be explicit about how we model such computations on finite, discrete
machines. Namely, when we talk about computing a function $f: \{0,1\}^* \to \mathbb{R}$, what we really mean
is computing $f$ to any desired number $q$ of bits of precision. More precisely, given $x, q$, the task is to
compute a number $y \in \mathbb{Q}$ such that $|f(x) - y| \leq 2^{-q}$. We measure the complexity of algorithms for
this task as a function of $|x| + q$. In order to reason about the complexity of OptComp, we will
also require that the inputs be rational. So when we talk about computing OptComp exactly, we
actually mean given $\epsilon_1, \ldots, \epsilon_k \geq 0$, $\delta_1, \ldots, \delta_k, \delta_g \in [0,1)$ all rational and an integer $q$, compute $\epsilon^*$
such that:

\[ |\epsilon^* - \mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g)| \leq 2^{-q} \]

where $\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g)$ is the true optimal parameter with full precision.


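3 Characterization of OptComp


Following [KOV15], we show that to analyze the composition of arbitrary $(\epsilon_i, \delta_i)$-DP algorithms, it
suffices to analyze the composition of the following simple variant of randomized response [War65].


Definition 3.1 ([KOV15]). Define a randomized algorithm $M_{(\epsilon,\delta)}: \{0,1\} \to \{0,1,2,3\}$ as follows,
setting $\alpha = 1 - \delta$:

\[
\begin{array}{ll}
\Pr[M_{(\epsilon,\delta)}(0) = 0] = \delta & \Pr[M_{(\epsilon,\delta)}(1) = 0] = 0 \\
\Pr[M_{(\epsilon,\delta)}(0) = 1] = \alpha \cdot \frac{e^{\epsilon}}{1 + e^{\epsilon}} & \Pr[M_{(\epsilon,\delta)}(1) = 1] = \alpha \cdot \frac{1}{1 + e^{\epsilon}} \\
\Pr[M_{(\epsilon,\delta)}(0) = 2] = \alpha \cdot \frac{1}{1 + e^{\epsilon}} & \Pr[M_{(\epsilon,\delta)}(1) = 2] = \alpha \cdot \frac{e^{\epsilon}}{1 + e^{\epsilon}} \\
\Pr[M_{(\epsilon,\delta)}(0) = 3] = 0 & \Pr[M_{(\epsilon,\delta)}(1) = 3] = \delta
\end{array}
\]

Note that $M_{(\epsilon,\delta)}$ is in fact $(\epsilon, \delta)$-DP. Kairouz, Oh, and Viswanath showed that $M_{(\epsilon,\delta)}$ can be
used to simulate the output of every $(\epsilon, \delta)$-DP algorithm on adjacent databases.


Lemma 3.2 ([KOV15]). For every $(\epsilon, \delta)$-DP algorithm $M$ and neighboring databases $D_0, D_1$, there
exists a randomized algorithm $T$ such that $T(M_{(\epsilon,\delta)}(b))$ is identically distributed to $M(D_b)$ for $b = 0$
and $b = 1$.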
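The claim that $M_{(\epsilon,\delta)}$ is $(\epsilon, \delta)$-DP can be checked numerically. The Python sketch below assumes Definition 3.1 assigns probabilities $\delta$, $\alpha e^{\epsilon}/(1+e^{\epsilon})$, $\alpha/(1+e^{\epsilon})$, $0$ to outputs $0, 1, 2, 3$ on input $0$ (mirrored on input $1$), and uses the standard observation that $\max_S (P(S) - e^{\epsilon} Q(S))$ is attained by the set of outcomes where $P(x) > e^{\epsilon} Q(x)$. The function names are illustrative.

```python
import math


def rr_pmf(eps, delta, b):
    """PMF of M_(eps,delta)(b) over outputs 0, 1, 2, 3."""
    alpha = 1 - delta
    high = alpha * math.exp(eps) / (1 + math.exp(eps))
    low = alpha / (1 + math.exp(eps))
    return [delta, high, low, 0.0] if b == 0 else [0.0, low, high, delta]


def privacy_loss_delta(p, q, eps):
    """max over sets S of p(S) - e^eps * q(S), i.e. the least delta such
    that Pr_p[S] <= e^eps * Pr_q[S] + delta holds for every S."""
    return sum(max(px - math.exp(eps) * qx, 0.0) for px, qx in zip(p, q))


if __name__ == "__main__":
    eps, delta = 0.5, 0.1  # illustrative values
    p0, p1 = rr_pmf(eps, delta, 0), rr_pmf(eps, delta, 1)
    print(privacy_loss_delta(p0, p1, eps), privacy_loss_delta(p1, p0, eps))
```

Both printed values equal $\delta$ up to floating-point error, and evaluating `privacy_loss_delta` at any smaller $\epsilon$ gives a value above $\delta$, so the parameters $(\epsilon, \delta)$ are tight for this mechanism.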


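For the sake of completeness, we provide a self-contained proof of this lemma, which does not
use the hypothesis testing and geometric arguments in [KOV15]. Specifically, we give an explicit
construction of the simulator $T$ in two steps. First we introduce a slight generalization of $M_{(\epsilon,\delta)}$
called $M_{(\epsilon,\delta_0,\delta_1)}$ and an algorithm $T'$ that can use $M_{(\epsilon,\delta_0,\delta_1)}$ to simulate every differentially private
algorithm on adjacent databases for some $\delta_0, \delta_1 \leq \delta$. Then we show how to simulate $M_{(\epsilon,\delta_0,\delta_1)}$ using
$M_{(\epsilon,\delta)}$ with an algorithm called $T''$. The construction will look like:

\[ b \;\longmapsto\; M_{(\epsilon,\delta)}(b) \;\xrightarrow{\;T''\;}\; M_{(\epsilon,\delta_0,\delta_1)}(b) \;\xrightarrow{\;T'\;}\; M(D_b) \]

Then the $T$ needed for Lemma 3.2 will be $T = T' \circ T''$. Before introducing $M_{(\epsilon,\delta_0,\delta_1)}$ and $T'$ we
define some additional notation.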

Given an $(\epsilon, \delta)$-DP algorithm $M$ with output space $R$ and neighboring databases $D_0, D_1$, let
$P_0, P_1$ be the probability mass functions of $M(D_0)$ and $M(D_1)$, respectively. The definition of
differential privacy tells us that for all sets $S \subseteq R$:

\[ P_0(S) - e^{\epsilon} P_1(S) \leq \delta \]
\[ P_1(S) - e^{\epsilon} P_0(S) \leq \delta \]

The left-hand side of the first inequality is maximized by $S = S_0$ for

\[ S_0 = \{r \in R : P_0(r) > e^{\epsilon} P_1(r)\} \qquad (2) \]

and the left-hand side of the second inequality is maximized by

\[ S_1 = \{r \in R : P_1(r) > e^{\epsilon} P_0(r)\} \qquad (3) \]

Define $\delta_0, \delta_1$ as

\[ \delta_0 = P_0(S_0) - e^{\epsilon} P_1(S_0) \leq \delta \qquad (4) \]
\[ \delta_1 = P_1(S_1) - e^{\epsilon} P_0(S_1) \leq \delta \qquad (5) \]


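We will show how to simulate $M$ using the following algorithm.


Definition 3.3. Define $M_{(\epsilon,\delta_0,\delta_1)}: \{0,1\} \to \{0,1,2,3\}$ as follows, with $\delta_0, \delta_1$ as defined in Equations
4 and 5 for some $(\epsilon, \delta)$-DP algorithm and setting $\alpha_0 = 1 - \delta_0$, $\alpha_1 = 1 - \delta_1$:

\[
\begin{array}{ll}
\Pr[M_{(\epsilon,\delta_0,\delta_1)}(0) = 0] = \delta_0 & \Pr[M_{(\epsilon,\delta_0,\delta_1)}(1) = 0] = 0 \\
\Pr[M_{(\epsilon,\delta_0,\delta_1)}(0) = 1] = \frac{e^{2\epsilon}\alpha_0 - e^{\epsilon}\alpha_1}{e^{2\epsilon} - 1} & \Pr[M_{(\epsilon,\delta_0,\delta_1)}(1) = 1] = \frac{e^{\epsilon}\alpha_0 - \alpha_1}{e^{2\epsilon} - 1} \\
\Pr[M_{(\epsilon,\delta_0,\delta_1)}(0) = 2] = \frac{e^{\epsilon}\alpha_1 - \alpha_0}{e^{2\epsilon} - 1} & \Pr[M_{(\epsilon,\delta_0,\delta_1)}(1) = 2] = \frac{e^{2\epsilon}\alpha_1 - e^{\epsilon}\alpha_0}{e^{2\epsilon} - 1} \\
\Pr[M_{(\epsilon,\delta_0,\delta_1)}(0) = 3] = 0 & \Pr[M_{(\epsilon,\delta_0,\delta_1)}(1) = 3] = \delta_1
\end{array}
\]

Notice that if $\delta_0 = \delta_1 = \delta$ then $M_{(\epsilon,\delta_0,\delta_1)} = M_{(\epsilon,\delta)}$. We need to show that $M_{(\epsilon,\delta_0,\delta_1)}$ is composed
of a valid probability distribution. Since $\alpha_b = 1 - \delta_b$,

\[ \sum_{x \in \{0,1,2,3\}} \Pr[M_{(\epsilon,\delta_0,\delta_1)}(b) = x] = 1 \quad \text{for } b = 0, 1 \]

To see that all of the terms are non-negative we need to show that the recurring terms $e^{\epsilon}\alpha_1 - \alpha_0$
and $e^{\epsilon}\alpha_0 - \alpha_1$ are non-negative and the rest follows by inspection.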

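Lemma 3.4. For every $(\epsilon, \delta)$-DP algorithm $M$ with output space $R$ and neighboring databases $D_0$
and $D_1$, $e^{\epsilon}\alpha_1 - \alpha_0$ and $e^{\epsilon}\alpha_0 - \alpha_1$ are non-negative where $\alpha_0 = 1 - \delta_0$, $\alpha_1 = 1 - \delta_1$ and $\delta_0, \delta_1$ are
defined in Equations 4 and 5.

Proof.

\[
\begin{aligned}
\alpha_1 &= 1 - P_1(S_1) + e^{\epsilon} P_0(S_1) \\
&\leq P_1(S_0) + e^{\epsilon} \cdot (1 - P_0(S_0)) \\
&\leq e^{2\epsilon} P_1(S_0) + e^{\epsilon} \cdot (1 - P_0(S_0)) \\
&= e^{\epsilon} \alpha_0
\end{aligned}
\]

The other inequality follows by symmetry. □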

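Now we show how to use $M_{(\epsilon,\delta_0,\delta_1)}$ to simulate any $(\epsilon, \delta)$ differentially private algorithm.

Lemma 3.5. For every $(\epsilon, \delta)$-DP algorithm $M$ with output space $R$, and every pair of neighboring
databases, $D_0, D_1$, there exist $\delta_0, \delta_1 \leq \delta$ and a randomized algorithm $T': \{0,1,2,3\} \to R$ such that
$T'(M_{(\epsilon,\delta_0,\delta_1)}(b))$ is identically distributed to $M(D_b)$ for $b = 0$ and $b = 1$.

Proof. Fix neighboring databases, $D_0, D_1$ and let $P_0, P_1$ be the probability mass functions of $M$ on
$D_0, D_1$, respectively. We will use $S_0, S_1, \delta_0$, and $\delta_1$ as defined above in Equations 2, 3, 4, and 5.
Fix $r \in R$. $T': \{0,1,2,3\} \to R$ is defined in the table below.

\[
\begin{array}{c|c|c|c}
x & \Pr[T'(x) = r],\ r \in S_0 & \Pr[T'(x) = r],\ r \in S_1 & \Pr[T'(x) = r],\ r \in R \setminus S_0 \setminus S_1 \\
\hline
0 & \frac{1}{\delta_0}\left(P_0(r) - e^{\epsilon} P_1(r)\right) & 0 & 0 \\
1 & \frac{(e^{2\epsilon} - 1) P_1(r)}{e^{\epsilon}\alpha_0 - \alpha_1} & 0 & \frac{e^{\epsilon} P_0(r) - P_1(r)}{e^{\epsilon}\alpha_0 - \alpha_1} \\
2 & 0 & \frac{(e^{2\epsilon} - 1) P_0(r)}{e^{\epsilon}\alpha_1 - \alpha_0} & \frac{e^{\epsilon} P_1(r) - P_0(r)}{e^{\epsilon}\alpha_1 - \alpha_0} \\
3 & 0 & \frac{1}{\delta_1}\left(P_1(r) - e^{\epsilon} P_0(r)\right) & 0
\end{array}
\]
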
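We need to show that $T'(x)$ is a valid probability distribution for each $x$. All of the terms are
non-negative because $e^{\epsilon}\alpha_1 - \alpha_0$ and $e^{\epsilon}\alpha_0 - \alpha_1$ are non-negative by Lemma 3.4.

The sums of $\Pr[T'(0) = r]$ and $\Pr[T'(3) = r]$ are immediate from the definitions of $\delta_0$ and $\delta_1$,
respectively:

\[ \sum_{r \in R} \Pr[T'(0) = r] = \frac{1}{\delta_0} \sum_{r \in S_0} \left(P_0(r) - e^{\epsilon} P_1(r)\right) + 0 + 0 = 1 \]

A symmetrical argument works for $\Pr[T'(3) = r]$. We now analyze the sum for $\Pr[T'(1) = r]$. The
sum for $\Pr[T'(2) = r]$ follows by symmetry. We use the following identities:

\[ \alpha_0 = 1 - \sum_{r \in S_0} \left(P_0(r) - e^{\epsilon} P_1(r)\right) = \sum_{r \in S_0} e^{\epsilon} P_1(r) + \sum_{r \in S_1} P_0(r) + \sum_{r \in R \setminus S_0 \setminus S_1} P_0(r) \]

\[ \alpha_1 = 1 - \sum_{r \in S_1} \left(P_1(r) - e^{\epsilon} P_0(r)\right) = \sum_{r \in S_0} P_1(r) + \sum_{r \in S_1} e^{\epsilon} P_0(r) + \sum_{r \in R \setminus S_0 \setminus S_1} P_1(r) \]

Thus:

\[ e^{\epsilon}\alpha_0 - \alpha_1 = \sum_{r \in S_0} (e^{2\epsilon} - 1) P_1(r) + \sum_{r \in R \setminus S_0 \setminus S_1} \left(e^{\epsilon} P_0(r) - P_1(r)\right) \]

This implies $\sum_{r \in R} \Pr[T'(1) = r] = 1$. Now we just need to show that $T'(M_{(\epsilon,\delta_0,\delta_1)}(b))$ is identically
distributed to $M(D_b)$. We will show this for $b = 0$ and the $b = 1$ case follows by symmetry. Fix
$r \in R$. By the definition of $M_{(\epsilon,\delta_0,\delta_1)}$,

\[ \Pr[T'(M_{(\epsilon,\delta_0,\delta_1)}(0)) = r] = \delta_0 \cdot \Pr[T'(0) = r] + \frac{e^{2\epsilon}\alpha_0 - e^{\epsilon}\alpha_1}{e^{2\epsilon} - 1} \cdot \Pr[T'(1) = r] + \frac{e^{\epsilon}\alpha_1 - \alpha_0}{e^{2\epsilon} - 1} \cdot \Pr[T'(2) = r] \]

From here we break the calculation into the three possible cases:

Case 1: $r \in S_0$

\[
\begin{aligned}
\Pr[T'(M_{(\epsilon,\delta_0,\delta_1)}(0)) = r] &= \delta_0 \cdot \frac{1}{\delta_0}\left(P_0(r) - e^{\epsilon} P_1(r)\right) + \frac{e^{2\epsilon}\alpha_0 - e^{\epsilon}\alpha_1}{e^{2\epsilon} - 1} \cdot \frac{(e^{2\epsilon} - 1) P_1(r)}{e^{\epsilon}\alpha_0 - \alpha_1} \\
&= P_0(r) - e^{\epsilon} P_1(r) + e^{\epsilon} P_1(r) = P_0(r)
\end{aligned}
\]

Case 2: $r \in S_1$

\[ \Pr[T'(M_{(\epsilon,\delta_0,\delta_1)}(0)) = r] = \frac{e^{\epsilon}\alpha_1 - \alpha_0}{e^{2\epsilon} - 1} \cdot \frac{(e^{2\epsilon} - 1) P_0(r)}{e^{\epsilon}\alpha_1 - \alpha_0} = P_0(r) \]

Case 3: $r \in R \setminus S_0 \setminus S_1$

\[
\begin{aligned}
\Pr[T'(M_{(\epsilon,\delta_0,\delta_1)}(0)) = r] &= \frac{e^{2\epsilon}\alpha_0 - e^{\epsilon}\alpha_1}{e^{2\epsilon} - 1} \cdot \frac{e^{\epsilon} P_0(r) - P_1(r)}{e^{\epsilon}\alpha_0 - \alpha_1} + \frac{e^{\epsilon}\alpha_1 - \alpha_0}{e^{2\epsilon} - 1} \cdot \frac{e^{\epsilon} P_1(r) - P_0(r)}{e^{\epsilon}\alpha_1 - \alpha_0} \\
&= \frac{e^{2\epsilon} P_0(r) - e^{\epsilon} P_1(r) + e^{\epsilon} P_1(r) - P_0(r)}{e^{2\epsilon} - 1} \\
&= P_0(r)
\end{aligned}
\]

□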
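The construction in Lemma 3.5 can be sanity-checked numerically on a small example. The Python sketch below builds $S_0$, $S_1$, $\delta_0$, $\delta_1$, the mechanism $M_{(\epsilon,\delta_0,\delta_1)}$, and the simulator $T'$ from two probability mass functions, then composes them. It assumes $\epsilon > 0$, $\delta_0, \delta_1 > 0$, and nonzero denominators $e^{\epsilon}\alpha_0 - \alpha_1$ and $e^{\epsilon}\alpha_1 - \alpha_0$; the function names and the example distributions are illustrative.

```python
import math


def build_simulator(p0, p1, eps):
    """Return (m, t): m[b][x] = Pr[M_(eps,d0,d1)(b) = x] and
    t[x][r] = Pr[T'(x) = r], following Definition 3.3 and Lemma 3.5.

    Assumes eps > 0, d0 > 0, d1 > 0 and nonzero denominators.
    """
    e = math.exp(eps)
    n = len(p0)
    s0 = [r for r in range(n) if p0[r] > e * p1[r]]
    s1 = [r for r in range(n) if p1[r] > e * p0[r]]
    d0 = sum(p0[r] - e * p1[r] for r in s0)
    d1 = sum(p1[r] - e * p0[r] for r in s1)
    a0, a1 = 1 - d0, 1 - d1
    den = e * e - 1
    m = [
        [d0, (e * e * a0 - e * a1) / den, (e * a1 - a0) / den, 0.0],
        [0.0, (e * a0 - a1) / den, (e * e * a1 - e * a0) / den, d1],
    ]
    t = [[0.0] * n for _ in range(4)]
    for r in range(n):
        if r in s0:
            t[0][r] = (p0[r] - e * p1[r]) / d0
            t[1][r] = den * p1[r] / (e * a0 - a1)
        elif r in s1:
            t[2][r] = den * p0[r] / (e * a1 - a0)
            t[3][r] = (p1[r] - e * p0[r]) / d1
        else:
            t[1][r] = (e * p0[r] - p1[r]) / (e * a0 - a1)
            t[2][r] = (e * p1[r] - p0[r]) / (e * a1 - a0)
    return m, t


def simulate(m, t, b):
    """Distribution of T'(M_(eps,d0,d1)(b)) over the output space."""
    n = len(t[0])
    return [sum(m[b][x] * t[x][r] for x in range(4)) for r in range(n)]


if __name__ == "__main__":
    # Output distributions of a hypothetical mechanism on D_0 and D_1.
    p0, p1 = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
    m, t = build_simulator(p0, p1, eps=0.5)
    print(simulate(m, t, 0))  # should match p0 up to rounding
    print(simulate(m, t, 1))  # should match p1 up to rounding
```

Checking that every row of `m` and `t` sums to one mirrors the validity arguments in the proofs, and comparing `simulate(m, t, b)` with `p0` and `p1` mirrors the three-case calculation.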

We have shown how a generalization of $M_{(\epsilon,\delta)}$ called $M_{(\epsilon,\delta_0,\delta_1)}$ can be used to simulate the output
of every differentially private algorithm. In the next lemma we show how to simulate $M_{(\epsilon,\delta_0,\delta_1)}$ using
$M_{(\epsilon,\delta)}$, which implies that $M_{(\epsilon,\delta)}$ can be used to simulate the output of every differentially private
algorithm by composing the simulator introduced in Lemma 3.5 with the one introduced below.

Lemma 3.6. For every $\epsilon > 0$ and $\delta_0, \delta_1, \delta \in [0,1)$ such that $e^{\epsilon} \cdot (1 - \delta_0) \geq 1 - \delta_1$ and $e^{\epsilon} \cdot (1 - \delta_1) \geq
1 - \delta_0$ and $\delta_0, \delta_1 \leq \delta$, there exists a randomized algorithm $T''$ such that $T''(M_{(\epsilon,\delta)}(b))$ is identically
distributed to $M_{(\epsilon,\delta_0,\delta_1)}(b)$ for both $b = 0, 1$.

Proof. Assume without loss of generality that $\delta_0 \geq \delta_1$ and set $\alpha = 1 - \delta$, $\alpha_0 = 1 - \delta_0$, and $\alpha_1 = 1 - \delta_1$.
We will represent $T''(M_{(\epsilon,\delta)}(b))$ as a Markov Chain below. Here, the probability of transitioning
from one state to another is proportional to the weight of an edge. That is, the true probability
along an edge leaving some node $a$ is the weight divided by the sum of the weights of all of the
edges leaving $a$ (this is just to avoid cluttering the diagram with the normalizing denominators).
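[Figure: Markov chain diagram for $T''(M_{(\epsilon,\delta)}(b))$, from the input node $b$ to the output nodes of $T''$.]

All of the weights are non-negative because $\alpha_1 \geq \alpha_0 \geq \alpha$, $e^{\epsilon}\alpha_1 \geq \alpha_0$, and $p$ is also at most 1,
which we verify now:

\[
\begin{aligned}
(\alpha_0 - \alpha) \cdot (e^{\epsilon}\alpha_0 - \alpha_1) &\leq (\alpha_1 - \alpha) \cdot (e^{2\epsilon}\alpha_0 - \alpha_1) \\
&\leq (\alpha_1 - \alpha) \cdot (e^{2\epsilon}\alpha_0 - \alpha_0) \\
&= (\alpha_1 - \alpha) \cdot \alpha_0 \cdot (e^{2\epsilon} - 1)
\end{aligned}
\]

We need to show that $T''(M_{(\epsilon,\delta)}(b))$ is identically distributed to $M_{(\epsilon,\delta_0,\delta_1)}(b)$ for $b = 0$ and
$b = 1$, which will complete the proof. Notice that $\Pr[T''(M_{(\epsilon,\delta)}(0)) = 3] = 0 = \Pr[M_{(\epsilon,\delta_0,\delta_1)}(0) = 3]$
because there is no path from the $b = 0$ node to the $T'' = 3$ node. Similarly, $\Pr[T''(M_{(\epsilon,\delta)}(1)) =
0] = 0 = \Pr[M_{(\epsilon,\delta_0,\delta_1)}(1) = 0]$. We also have:

\[
\begin{aligned}
\Pr[T''(M_{(\epsilon,\delta)}(0)) = 0] &= \left(\frac{\delta}{\delta + \alpha}\right) \left(\frac{\delta_0}{\delta_0 + (\delta - \delta_0)}\right) \\
&= \frac{\delta}{1} \cdot \frac{\delta_0}{\delta} \\
&= \delta_0 \\
&= \Pr[M_{(\epsilon,\delta_0,\delta_1)}(0) = 0]
\end{aligned}
\]

Similarly,

\[
\begin{aligned}
\Pr[T''(M_{(\epsilon,\delta)}(1)) = 3] &= \left(\frac{\delta}{\delta + \alpha}\right) \left(\frac{\delta_1}{\delta_1 + (\delta - \delta_1)}\right) \\
&= \frac{\delta}{1} \cdot \frac{\delta_1}{\delta} \\
&= \delta_1 \\
&= \Pr[M_{(\epsilon,\delta_0,\delta_1)}(1) = 3]
\end{aligned}
\]

Next we check the probabilities with which $T''$ outputs 1 and 2 when $b = 0$. Summing over the
paths from the $b = 0$ node to the $T'' = 1$ node gives

\[ \Pr[T''(M_{(\epsilon,\delta)}(0)) = 1] = \frac{e^{2\epsilon}\alpha_0 - e^{\epsilon}\alpha_1}{e^{2\epsilon} - 1} = \Pr[M_{(\epsilon,\delta_0,\delta_1)}(0) = 1] \]

It follows that $\Pr[T''(M_{(\epsilon,\delta)}(0)) = 2] = \Pr[M_{(\epsilon,\delta_0,\delta_1)}(0) = 2]$ because the probabilities sum to 1.
Finally we show the probabilities with which $T''$ outputs 1 and 2 when $b = 1$. Summing over the
paths from the $b = 1$ node to the $T'' = 1$ node gives

\[ \Pr[T''(M_{(\epsilon,\delta)}(1)) = 1] = \frac{e^{\epsilon}\alpha_0 - \alpha_1}{e^{2\epsilon} - 1} = \Pr[M_{(\epsilon,\delta_0,\delta_1)}(1) = 1] \]

Again because the probabilities sum to 1, it follows that $\Pr[T''(M_{(\epsilon,\delta)}(1)) = 2] = \Pr[M_{(\epsilon,\delta_0,\delta_1)}(1) =
2]$, which completes the proof. □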

So $M_{(\epsilon,\delta)}$ can simulate any $(\epsilon, \delta)$ differentially private algorithm. Since it is known that
post-processing preserves differential privacy (Fact 2.2), it follows that to analyze the composition of
arbitrary differentially private algorithms, it suffices to analyze the composition of $M_{(\epsilon_i,\delta_i)}$’s.
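Lemma 3.7. For all $\epsilon_1, \ldots, \epsilon_k \geq 0$, $\delta_1, \ldots, \delta_k, \delta_g \in [0,1)$,

\[ \mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g) = \mathrm{OptComp}(M_{(\epsilon_1,\delta_1)}, \ldots, M_{(\epsilon_k,\delta_k)}, \delta_g) \]

Proof. Since $M_{(\epsilon_1,\delta_1)}, \ldots, M_{(\epsilon_k,\delta_k)}$ are $(\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k)$-differentially private, we have:

\[
\begin{aligned}
\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g) &= \sup\{\mathrm{OptComp}(M_1, \ldots, M_k, \delta_g) : M_i \text{ is } (\epsilon_i, \delta_i)\text{-DP } \forall i \in [k]\} \\
&\geq \mathrm{OptComp}(M_{(\epsilon_1,\delta_1)}, \ldots, M_{(\epsilon_k,\delta_k)}, \delta_g)
\end{aligned}
\]

For the other direction, it suffices to show that for every $M_1, \ldots, M_k$ that are $(\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k)$-
differentially private, we have

\[ \mathrm{OptComp}(M_1, \ldots, M_k, \delta_g) \leq \mathrm{OptComp}(M_{(\epsilon_1,\delta_1)}, \ldots, M_{(\epsilon_k,\delta_k)}, \delta_g) \]

That is,

\[ \inf\{\epsilon_g \geq 0 : (M_1, \ldots, M_k) \text{ is } (\epsilon_g, \delta_g)\text{-DP}\} \leq \inf\{\epsilon_g \geq 0 : (M_{(\epsilon_1,\delta_1)}, \ldots, M_{(\epsilon_k,\delta_k)}) \text{ is } (\epsilon_g, \delta_g)\text{-DP}\} \]

So suppose $(M_{(\epsilon_1,\delta_1)}, \ldots, M_{(\epsilon_k,\delta_k)})$ is $(\epsilon_g, \delta_g)$-DP. We will show that $(M_1, \ldots, M_k)$ is also $(\epsilon_g, \delta_g)$-DP.
Taking the infimum over $\epsilon_g$ then completes the proof.

We know from Lemma 3.2 that for every pair of neighboring databases $D_0, D_1$, there must exist
randomized algorithms $T_1, \ldots, T_k$ such that $T_i(M_{(\epsilon_i,\delta_i)}(b))$ is identically distributed to $M_i(D_b)$ for
all $i \in \{1, \ldots, k\}$. By hypothesis we have

\[ D_{\infty}^{\delta_g}\left( (M_{(\epsilon_1,\delta_1)}(0), \ldots, M_{(\epsilon_k,\delta_k)}(0)) \,\|\, (M_{(\epsilon_1,\delta_1)}(1), \ldots, M_{(\epsilon_k,\delta_k)}(1)) \right) \leq \epsilon_g \]

Thus by Fact 2.2 we have:

\[
\begin{aligned}
& D_{\infty}^{\delta_g}\left( (M_1(D_0), \ldots, M_k(D_0)) \,\|\, (M_1(D_1), \ldots, M_k(D_1)) \right) \\
&\quad = D_{\infty}^{\delta_g}\left( (T_1(M_{(\epsilon_1,\delta_1)}(0)), \ldots, T_k(M_{(\epsilon_k,\delta_k)}(0))) \,\|\, (T_1(M_{(\epsilon_1,\delta_1)}(1)), \ldots, T_k(M_{(\epsilon_k,\delta_k)}(1))) \right) \leq \epsilon_g
\end{aligned}
\]

□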


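Now we are ready to characterize OptComp for an arbitrary set of differentially private
algorithms.

Proof of Theorem 1.5. Given $(\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k)$ and $\delta_g$, let $M^k(b)$ denote the composition
$(M_{(\epsilon_1,\delta_1)}(b), \ldots, M_{(\epsilon_k,\delta_k)}(b))$ and let $P_b^k(x)$ be the probability mass function of $M^k(b)$, for $b = 0$
and $b = 1$. By Lemma 3.7, $\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g)$ is the smallest value of $\epsilon_g$ such that:

\[ \delta_g \geq \max_{Q \subseteq \{0,1,2,3\}^k} \max\left\{ P_0^k(Q) - e^{\epsilon_g} \cdot P_1^k(Q),\ P_1^k(Q) - e^{\epsilon_g} \cdot P_0^k(Q) \right\} \]

Since $M^k$ is symmetric, we can instead consider the smallest value of $\epsilon_g$ such that:

\[ \delta_g \geq \max_{Q \subseteq \{0,1,2,3\}^k} \left\{ P_0^k(Q) - e^{\epsilon_g} \cdot P_1^k(Q) \right\} \]

without loss of generality. Given $\epsilon_g$, the set $S \subseteq \{0,1,2,3\}^k$ that maximizes the right-hand side is

\[ S = S(\epsilon_g) = \{x \in \{0,1,2,3\}^k \mid P_0^k(x) \geq e^{\epsilon_g} \cdot P_1^k(x)\} \]

We can further split $S(\epsilon_g)$ into $S(\epsilon_g) = S_0(\epsilon_g) \cup S_1(\epsilon_g)$ with

\[ S_0(\epsilon_g) = \{x \in \{0,1,2,3\}^k \mid P_1^k(x) = 0\} \]
\[ S_1(\epsilon_g) = \{x \in \{0,1,2,3\}^k \mid P_0^k(x) \geq e^{\epsilon_g} \cdot P_1^k(x), \text{ and } P_1^k(x) > 0\} \]

Note that $S_0(\epsilon_g) \cap S_1(\epsilon_g) = \emptyset$. We have $P_1^k(S_0(\epsilon_g)) = 0$ and $P_0^k(S_0(\epsilon_g)) = 1 - \Pr[M^k(0) \in
\{1,2,3\}^k] = 1 - \prod_{i=1}^{k} (1 - \delta_i)$. So

\[
\begin{aligned}
P_0^k(S(\epsilon_g)) - e^{\epsilon_g} P_1^k(S(\epsilon_g)) &= P_0^k(S_0(\epsilon_g)) - e^{\epsilon_g} P_1^k(S_0(\epsilon_g)) + P_0^k(S_1(\epsilon_g)) - e^{\epsilon_g} P_1^k(S_1(\epsilon_g)) \\
&= 1 - \prod_{i=1}^{k} (1 - \delta_i) + P_0^k(S_1(\epsilon_g)) - e^{\epsilon_g} P_1^k(S_1(\epsilon_g))
\end{aligned}
\]

Now we just need to analyze $P_0^k(S_1(\epsilon_g)) - e^{\epsilon_g} P_1^k(S_1(\epsilon_g))$. Notice that $S_1(\epsilon_g) \subseteq \{1,2\}^k$ because
for all $x \in S_1(\epsilon_g)$, we have $P_0^k(x) \geq P_1^k(x) > 0$. So we can write:

\[
\begin{aligned}
& P_0^k(S_1(\epsilon_g)) - e^{\epsilon_g} \cdot P_1^k(S_1(\epsilon_g)) \\
&\quad = \sum_{y \in \{1,2\}^k} \max\left\{ \prod_{i : y_i = 1} \frac{(1 - \delta_i) e^{\epsilon_i}}{1 + e^{\epsilon_i}} \cdot \prod_{i : y_i = 2} \frac{1 - \delta_i}{1 + e^{\epsilon_i}} - e^{\epsilon_g} \prod_{i : y_i = 1} \frac{1 - \delta_i}{1 + e^{\epsilon_i}} \cdot \prod_{i : y_i = 2} \frac{(1 - \delta_i) e^{\epsilon_i}}{1 + e^{\epsilon_i}},\ 0 \right\} \\
&\quad = \prod_{i=1}^{k} \frac{1 - \delta_i}{1 + e^{\epsilon_i}} \sum_{y \in \{0,1\}^k} \max\left\{ \frac{e^{\sum_{i=1}^{k} \epsilon_i}}{e^{\sum_{i=1}^{k} y_i \epsilon_i}} - e^{\epsilon_g} \cdot e^{\sum_{i=1}^{k} y_i \epsilon_i},\ 0 \right\}
\end{aligned}
\]

Putting everything together yields:

\[
\begin{aligned}
\delta_g &\geq P_0^k(S_0(\epsilon_g)) - e^{\epsilon_g} P_1^k(S_0(\epsilon_g)) + P_0^k(S_1(\epsilon_g)) - e^{\epsilon_g} P_1^k(S_1(\epsilon_g)) \\
&= 1 - \prod_{i=1}^{k} (1 - \delta_i) + \frac{\prod_{i=1}^{k} (1 - \delta_i)}{\prod_{i=1}^{k} (1 + e^{\epsilon_i})} \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\}
\end{aligned}
\]

□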

We have characterized the optimal composition for an arbitrary set of differentially private
algorithms $(M_1, \ldots, M_k)$ under the assumption that the algorithms are chosen in advance and all
run on the same database. Next we show that OptComp under this restrictive model of composition
is actually equivalent to OptComp under the more general $k$-fold adaptive composition discussed in Section 2.

Theorem 3.8. The privacy parameters $\epsilon_1, \ldots, \epsilon_k \geq 0$, $\delta_1, \ldots, \delta_k \in [0,1)$, satisfy $(\epsilon_g, \delta_g)$-differential
privacy under adaptive composition for $\epsilon_g, \delta_g \geq 0$ if and only if $\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g) \leq
\epsilon_g$.

Proof. First suppose the privacy parameters $\epsilon_1, \ldots, \epsilon_k, \delta_1, \ldots, \delta_k$ satisfy $(\epsilon_g, \delta_g)$-differential privacy
under adaptive composition. Then $\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g) \leq \epsilon_g$ because adaptive
composition is more general than the composition defining OptComp.
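Conversely, suppose $\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g) \leq \epsilon_g$. In particular, this means
$\mathrm{OptComp}(M_{(\epsilon_1,\delta_1)}, \ldots, M_{(\epsilon_k,\delta_k)}, \delta_g) \leq \epsilon_g$. To complete the proof, we must show that the privacy
parameters $\epsilon_1, \ldots, \epsilon_k, \delta_1, \ldots, \delta_k$ satisfy $(\epsilon_g, \delta_g)$-differential privacy under adaptive composition.

Fix an adversary $A$. On each round $i$, $A$ uses its coin tosses $r$ and the previous outputs
$y_1, \ldots, y_{i-1}$ to select an $(\epsilon_i, \delta_i)$-differentially private algorithm $M_i = M_i^{r, y_1, \ldots, y_{i-1}}$ and neighboring
databases $D_0 = D_0^{r, y_1, \ldots, y_{i-1}}$, $D_1 = D_1^{r, y_1, \ldots, y_{i-1}}$. Let $V^b$ be the view of $A$ with the given privacy
parameters under composition game $b$ for $b = 0$ and $b = 1$.

Lemma 3.2 tells us that there exists an algorithm $T_i = T_i^{r, y_1, \ldots, y_{i-1}}$ such that $T_i(M_{(\epsilon_i,\delta_i)}(b))$
is identically distributed to $M_i(D_b)$ for both $b = 0, 1$ for all $i \in [k]$. Define $\hat{T}(z_1, \ldots, z_k)$ for
$z_1, \ldots, z_k \in \{0,1,2,3\}$ as follows:

1. Randomly choose coins $r$ for $A$

2. For $i = 1, \ldots, k$, let $y_i \leftarrow T_i^{r, y_1, \ldots, y_{i-1}}(z_i)$

3. Output $(r, y_1, \ldots, y_k)$

Notice that $\hat{T}(M_{(\epsilon_1,\delta_1)}(b), \ldots, M_{(\epsilon_k,\delta_k)}(b))$ is identically distributed to $V^b$ for both $b = 0, 1$. By
hypothesis we have

\[ D_{\infty}^{\delta_g}\left( (M_{(\epsilon_1,\delta_1)}(0), \ldots, M_{(\epsilon_k,\delta_k)}(0)) \,\|\, (M_{(\epsilon_1,\delta_1)}(1), \ldots, M_{(\epsilon_k,\delta_k)}(1)) \right) \leq \epsilon_g \]

Thus by Fact 2.2 we have:

\[ D_{\infty}^{\delta_g}(V^0 \| V^1) = D_{\infty}^{\delta_g}\left( \hat{T}(M_{(\epsilon_1,\delta_1)}(0), \ldots, M_{(\epsilon_k,\delta_k)}(0)) \,\|\, \hat{T}(M_{(\epsilon_1,\delta_1)}(1), \ldots, M_{(\epsilon_k,\delta_k)}(1)) \right) \leq \epsilon_g \]

□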


4 Hardness of OptComp

#P is the class of all counting problems associated with decision problems in NP. It is a set of
functions that count the number of solutions to some NP problem. More formally:

Definition 4.1. A function $f: \{0,1\}^* \to \mathbb{N}$ is in the class #P if there exists a polynomial $p: \mathbb{N} \to \mathbb{N}$
and a polynomial time algorithm $M$ such that for every $x \in \{0,1\}^*$:

\[ f(x) = \left| \{y \in \{0,1\}^{p(|x|)} : M(x, y) = 1\} \right| \]

Definition 4.2. A function $g$ is called #P-hard if every function $f \in$ #P can be computed in
polynomial time given oracle access to $g$. That is, evaluations of $g$ can be done in one time step.

If a function is #P-hard, then there is no polynomial-time algorithm for computing it unless
there is a polynomial-time algorithm for counting the number of solutions of all NP problems.

Definition 4.3. A function $f$ is called #P-easy if there is some function $g \in$ #P such that $f$ can
be computed in polynomial time given oracle access to $g$.

If a function is both #P-hard and #P-easy, we say it is #P-complete. Proving that computing
OptComp is #P-complete can be broken into two steps: showing that it is #P-easy and showing
that it is #P-hard.
Lemma 4.4. Computing OptComp is #P-easy.

Proof. For convenience we will view rational $(\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k)$ and $\epsilon_g$ as given arguments to
OptComp and compute $\delta_g$. Recall that the two versions of OptComp, viewing $\epsilon_g$ as given and
computing $\delta_g$ and vice versa, are equivalent up to a polynomial factor (just run binary search over
values of $\delta_g$ computing polynomially many bits of precision). So the formulation we choose for the
proof will not affect whether OptComp is #P-easy or not. Recall that in our model of computing
real-valued functions, we will take another input $q$ and we will output an approximation of $\delta_g$ to $q$ bits
of precision in polynomial time using a #P oracle where $\delta_g$ satisfies the following:

\[ \frac{1}{\prod_{i=1}^{k} (1 + e^{\epsilon_i})} \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\} = 1 - \frac{1 - \delta_g}{\prod_{i=1}^{k} (1 - \delta_i)} \]

Notice that the only part of the expression above that cannot be computed in polynomial time is
the summation over subsets of $\{1, \ldots, k\}$. If we knew the sum, computing $\delta_g$ would be easy given
our inputs. We show how to compute the sum in polynomial time using a #P oracle and it follows
that computing $\delta_g$ is #P-easy.

Define $f: 2^{[k]} \to \mathbb{R}$ as $f(S) = \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\}$. $f$ is computable in polynomial
time (to any desired precision). Let $\tilde{f}$ be a function computable in polynomial time where
$|\tilde{f}(S) - f(S)| \leq \ldots$ for all $S$. Set $m = 10^{\ldots}$. Now define the function $g: 2^{[k]} \times \mathbb{N} \to \{0,1\}$ as
follows:

\[ g(S, n) = \begin{cases} 1 & \text{if } m \cdot \tilde{f}(S) \geq n \\ 0 & \text{otherwise} \end{cases} \]

We can now phrase a decision problem in NP: Does there exist a pair $(S, n)$ such that $g(S, n) = 1$?
This is in NP because given a witness $(S, n)$, we can compute $m \cdot \tilde{f}(S)$ and compare the output to
$n$, thereby verifying the solution, in polynomial time. Since this is an NP problem, a #P oracle
can count the number of solutions to it in one time step. Notice that for every set $S$, the number of
solutions (pairs of the form $(S, n)$ satisfying $g(S, n) = 1$) is exactly $m \cdot \tilde{f}(S)$ because $g$ will output 1
for $g(S, 1), g(S, 2), \ldots, g(S, m \cdot \tilde{f}(S))$. So over all possible sets $S$, the number of solutions as counted
by the #P oracle equals $m \cdot \sum_{S \subseteq [k]} \tilde{f}(S)$. Dividing this by $m$ gives us the sum up to an additive
error of $\ldots$, which can be used to compute $\delta_g$ to $q$ bits of precision in polynomial time. This
only required one call to a #P oracle. So computing OptComp is #P-easy. □


Next we show that computing OptComp is also #P-hard through a series of reductions. We
start with a multiplicative version of the partition problem that is known to be #P-complete by
Ehrgott [Ehr00]. The problems in the chain of reductions are defined below.

Definition 4.5. #INT-PARTITION is the following problem: given a set $Z = \{z_1, z_2, \ldots, z_k\}$ of
positive integers, count the number of partitions $P \subseteq [k]$ such that

$$\prod_{i \in P} z_i - \prod_{i \notin P} z_i = 0$$

All of the remaining problems in our chain of reductions take inputs $\{w_1, \ldots, w_k\}$ where
$1 \leq w_i \leq e$ is the $D$th root of a positive integer $z_i$ for all $i \in [k]$ and some positive integer $D$. All of the
reductions we present actually hold for every positive integer $D$, including $D = 1$ (in which case
the inputs are integers). However, we will constrain $D$ to be large enough so that our inputs are in
the range $[1, e]$. This is because in the final reduction to OptComp, the $\epsilon_i$ values in the proof are set
to $\ln(w_i)$. We want to show that our reductions hold for reasonable values of $\epsilon$'s in a differential
privacy setting so throughout the proofs we use $w_i$'s $\in [1, e]$ to correspond to $\epsilon_i$'s $\in [0, 1]$ in the
final reduction. In fact, we will later state our reductions as applying to instances where $\prod_i w_i \leq e^{\epsilon}$
(and hence $\sum_i \epsilon_i \leq \epsilon$) for any desired $\epsilon > 0$.
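As a concrete reading of Definition 4.5, the sketch below counts #INT-PARTITION solutions by brute force. It is exponential in $k$ and only meant to illustrate what is being counted; the function name `count_int_partition` is an assumption for this example.

```python
from itertools import combinations
from math import prod


def count_int_partition(z):
    """Count index sets P whose product equals the product of the complement.

    With total = prod(z), the complement product is total / inside, so the
    two products are equal exactly when inside * inside == total.
    """
    k = len(z)
    total = prod(z)
    count = 0
    for size in range(k + 1):
        for P in combinations(range(k), size):
            inside = prod(z[i] for i in P)
            if inside * inside == total:
                count += 1
    return count
```

For $Z = \{2, 3, 6\}$ the two solutions are $P = \{1, 2\}$ and its complement $P = \{3\}$ (1-indexed).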

Definition 4.6. #PARTITION is the following problem: given a number $D \in \mathbb{N}$ and a set
$W = \{w_1, w_2, \ldots, w_k\}$ of real numbers where $1 \leq w_1, \ldots, w_k \leq e$ are $D$th roots of positive integers
$z_1, \ldots, z_k$, respectively, count the number of partitions $P \subseteq [k]$ such that

$$\prod_{i \in P} w_i - \prod_{i \notin P} w_i = 0$$

(The real numbers $w_1, \ldots, w_k$ are specified in the input by $z_1, \ldots, z_k$ and $D$ with the input size
being the combined bit length of these integers in binary).

Definition 4.7. #T-PARTITION is the following problem: given a number $D \in \mathbb{N}$, a set $W =
\{w_1, w_2, \ldots, w_k\}$ of real numbers and a positive real number $T$ where $1 \leq w_1, \ldots, w_k \leq e$ are $D$th
roots of positive integers $z_1, \ldots, z_k$, respectively, and $T = \sqrt[D]{t} - \sqrt[D]{t'}$ for two integers $t, t'$, count
the number of partitions $P \subseteq [k]$ such that

$$\prod_{i \in P} w_i - \prod_{i \notin P} w_i = T$$

(The real numbers $w_1, \ldots, w_k$ and $T$ are specified in the input by $z_1, \ldots, z_k, t, t'$ and $D$ with the
input size being the combined bit length of these integers in binary).

Definition 4.8. SUM-PARTITION: given a number $D \in \mathbb{N}$ and a set $W = \{w_1, w_2, \ldots, w_k\}$ of
real numbers where $1 \leq w_1, \ldots, w_k \leq e$ are $D$th roots of positive integers $z_1, \ldots, z_k$, respectively,
and a rational number $r > 1$, find

$$\sum_{P \subseteq [k]} \max\left\{ \prod_{i \in P} w_i - r \cdot \prod_{i \notin P} w_i,\ 0 \right\}$$

(The real numbers $w_1, \ldots, w_k$ are specified in the input by $z_1, \ldots, z_k$ and $D$ with the input size
being the combined bit length of these integers and the numerator and denominator of $r$ in binary).

Since the output of SUM-PARTITION is irrational, the actual computational problem is defined
according to our convention for computing real-valued functions. That is, given an
additional input $q$, compute a number $y$ such that

$$\left| y - \sum_{P \subseteq [k]} \max\left\{ \prod_{i \in P} w_i - r \cdot \prod_{i \notin P} w_i,\ 0 \right\} \right| \leq 2^{-q}$$

We prove that computing OptComp is #P-hard by the following series of reductions:

#INT-PARTITION ≤ #PARTITION ≤ #T-PARTITION ≤ SUM-PARTITION ≤ OptComp

Since #INT-PARTITION is known to be #P-complete [Ehr00], the chain of reductions will
prove that OptComp is #P-hard.

Lemma 4.9. For every constant $c > 1$, #PARTITION is #P-hard, even on instances where
$\prod_i w_i \leq c$.

Proof. Given an instance of #INT-PARTITION, $\{z_1, \ldots, z_k\}$, we show how to find the solution
in polynomial time using a #PARTITION oracle. Set $D = \lceil \log_c(\prod_i z_i) \rceil$ and $w_i = \sqrt[D]{z_i}$ $\forall i \in [k]$.
Note that $\prod_i w_i = (\prod_i z_i)^{1/D} \leq c$. Let $P \subseteq [k]$:

$$\prod_{i \in P} w_i = \prod_{i \notin P} w_i \iff \left( \prod_{i \in P} w_i \right)^{D} = \left( \prod_{i \notin P} w_i \right)^{D} \iff \prod_{i \in P} z_i = \prod_{i \notin P} z_i$$

There is a one-to-one correspondence between solutions to the #PARTITION problem and solutions
to the given #INT-PARTITION instance. We can solve #INT-PARTITION in polynomial
time with a #PARTITION oracle. Therefore #PARTITION is #P-hard. □
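The choice of $D$ in this proof can be checked numerically. The sketch below is illustrative only: it uses floating point rather than the exact integer encoding of the reduction, and the name `roots_instance` and the guard `max(1, ...)` (for the degenerate all-ones input, where the logarithm is zero) are assumptions of this example.

```python
import math


def roots_instance(z, c):
    """Return (D, w) with w_i = z_i ** (1/D), so that prod(w) <= c.

    D = ceil(log_c(prod z)) as in the proof; max(1, ...) is a hypothetical
    guard so that D stays a valid root index when every z_i equals 1.
    """
    total = math.prod(z)
    D = max(1, math.ceil(math.log(total) / math.log(c)))
    w = [zi ** (1.0 / D) for zi in z]
    return D, w
```

For $Z = \{2, 3, 6\}$ and $c = e$ this gives $D = 4$, and the product of the $w_i$ is $36^{1/4} \approx 2.45 \leq e$.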

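Lemma 4.10. For every constant $c > 1$, #T-PARTITION is #P-hard, even on instances where
$\prod_i w_i \leq c$.

Proof. Let $c > 1$ be a constant. We will reduce from #PARTITION, so consider an instance of the
#PARTITION problem, $W = \{w_1, w_2, \ldots, w_k\}$ of $D$th roots of integers $z_1, \ldots, z_k$, respectively.
We may assume $\prod_i w_i \leq \sqrt{c}$ since $\sqrt{c}$ is also a constant greater than 1.

Set $W' = W \cup \{w_{k+1}\}$, where $w_{k+1} = \prod_{i=1}^{k} w_i$. Notice that $\prod_{i=1}^{k+1} w_i \leq (\sqrt{c})^2 = c$. Set
$T = \sqrt{w_{k+1}}\,(w_{k+1} - 1)$. Notice that $w_{k+1} = \sqrt[D]{\prod_{i=1}^{k} z_i}$, so by setting integers $t = (\prod_{i=1}^{k} z_i)^3$ and
$t' = \prod_{i=1}^{k} z_i$ we get that

$$T = \sqrt[2D]{t} - \sqrt[2D]{t'}$$

which meets the input requirement for #T-PARTITION (with root index $2D$, each $w_i$ being the $2D$th root of $z_i^2$). So we can use a #T-PARTITION
oracle to count the number of partitions $Q \subseteq \{1, \ldots, k+1\}$ such that

$$\prod_{i \in Q} w_i - \prod_{i \notin Q} w_i = T$$

Let $P = Q \cap \{1, \ldots, k\}$. We will argue that $\prod_{i \in Q} w_i - \prod_{i \notin Q} w_i = T$ if and only if
$\prod_{i \in P} w_i = \prod_{i \notin P} w_i$, which completes the proof. There are two cases to consider: $w_{k+1} \in Q$ and $w_{k+1} \notin Q$.

Case 1: $w_{k+1} \in Q$. In this case, writing $w = \prod_{i \in [k]} w_i$ and $v = \prod_{i \in P} w_i$, we have:

$$w \cdot v - \frac{w}{v} = \sqrt{w}\,(w - 1)$$
$$\iff w \cdot v^2 - \sqrt{w}\,(w - 1) \cdot v - w = 0 \qquad \text{(multiplied both sides by } \textstyle\prod_{i \in P} w_i\text{)}$$
$$\iff \left( v - \sqrt{w} \right)\left( w \cdot v + \sqrt{w} \right) = 0 \qquad \text{(factored quadratic in } \textstyle\prod_{i \in P} w_i\text{)}$$
$$\iff \prod_{i \in P} w_i = \sqrt{\prod_{i \in [k]} w_i}$$
$$\iff \prod_{i \notin P} w_i = \prod_{i \in P} w_i$$

So there is a one-to-one correspondence between solutions to the #T-PARTITION instance
$W'$ where $w_{k+1} \in Q$ and solutions to the original #PARTITION instance $W$.

Case 2: $w_{k+1} \notin Q$. Solutions now look like:

$$\prod_{i \in P} w_i - \prod_{i \notin P} w_i \cdot \prod_{i \in [k]} w_i = \sqrt{\prod_{i \in [k]} w_i} \left( \prod_{i \in [k]} w_i - 1 \right)$$

One way this can be true is if $w_i = 1$ for all $i \in [k]$. We can check ahead of time if our input
set $W$ contains all ones. If it does, then there are $2^k - 2$ partitions that yield equal products (all
except $P = [k]$ and $P = \emptyset$) so we can just output $2^k - 2$ as the solution and not even use our oracle.
The only other way to satisfy the above expression is for $\prod_{i \in P} w_i > \prod_{i \in [k]} w_i$, which cannot happen
because $P \subseteq [k]$. So there are no solutions in the case that $w_{k+1} \notin Q$.

Therefore the output of the #T-PARTITION oracle on $W'$ is the solution to the #PARTITION
problem. So #T-PARTITION is #P-hard. □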
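The reduction can be sanity-checked on a small instance. The sketch below stands in for the #T-PARTITION oracle with a brute-force counter and compares against a direct count of equal-product partitions. It uses floating point with a tolerance and does not treat the all-ones special case handled separately in the proof; the names `count_gap` and `partition_via_t_partition` are assumptions of this example.

```python
import math
from itertools import combinations


def count_gap(w, target, tol=1e-9):
    """Count sets Q with prod_{i in Q} w_i - prod_{i not in Q} w_i == target (up to tol)."""
    k = len(w)
    total = math.prod(w)
    count = 0
    for size in range(k + 1):
        for Q in combinations(range(k), size):
            inside = math.prod(w[i] for i in Q)
            if abs(inside - total / inside - target) <= tol:
                count += 1
    return count


def partition_via_t_partition(w):
    """Count equal-product partitions of w via the extended instance W'."""
    w_extra = math.prod(w)                    # w_{k+1} = prod_i w_i
    T = math.sqrt(w_extra) * (w_extra - 1.0)  # target T = sqrt(w_{k+1}) (w_{k+1} - 1)
    return count_gap(list(w) + [w_extra], T)
```

With $D = 1$ and $W = \{2, 3, 6\}$, both counts equal 2, matching Case 1 of the proof.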

For the next two proofs we will make use of the following fact to bound the amount of precision 
needed when approximating irrational numbers by rational ones in our reductions: 

Fact 4.11. For all real numbers $y > x$ and functions $f$ that are differentiable on the interval $[x, y]$:

$$f(y) - f(x) \geq (y - x) \cdot \min_{z \in [x, y]} f'(z)$$

Lemma 4.12. For every constant $c > 1$, SUM-PARTITION is #P-hard even on instances where
$\prod_i w_i \leq c$ and where there are no partitions $S$ such that $\prod_{i \in S} w_i = r \cdot \prod_{i \notin S} w_i$.

Proof. We will use a SUM-PARTITION oracle to solve #T-PARTITION given a set $W =
\{w_1, \ldots, w_k\}$ of $D$th roots of positive integers $z_1, \ldots, z_k$, respectively, and a positive real number
$T$ specified by integers $t, t'$ given in the input. Notice that for every $x > 0$:

$$\prod_{i \in P} w_i - \prod_{i \notin P} w_i = x$$
$$\iff \prod_{i \in P} w_i - \frac{\prod_{i \in [k]} w_i}{\prod_{i \in P} w_i} = x$$
$$\implies \exists\, j \in \mathbb{Z}^{+} \text{ such that } \sqrt[D]{j} - \frac{\prod_{i \in [k]} w_i}{\sqrt[D]{j}} = x$$

Above, $j = \prod_{i \in P} z_i$ must be a positive integer greater than $\sqrt{\prod_{i \in [k]} z_i}$ (because $x > 0$), which tells us that the gap in
products from every partition must take a particular form. This means that for a given $D$ and
$W$, #x-PARTITION can only be non-zero on a discrete set of possible values of $x$. So given
our #T-PARTITION instance we can find a $T' > T$ such that the above has no solutions for
$x$ in the interval $(T, T')$. Specifically, solve the above quadratic for $\sqrt[D]{j}$ with $x = T$. If $j$ is not an integer,
then we know the answer to the #T-PARTITION instance is 0, so assume $j$ is an integer and set
$T' = \sqrt[D]{j+1} - \prod_{i \in [k]} w_i / \sqrt[D]{j+1}$. We can also find an interval $(T'', T)$ just below $T$ where no value
of $x$ in the interval can yield a solution above by setting $T'' = \sqrt[D]{j-1} - \prod_{i \in [k]} w_i / \sqrt[D]{j-1}$. We use
these discreteness properties twice in the proof. Also notice that these intervals are not too small:


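Claim 4.13. $T' - T \geq 2^{-\mathrm{poly}(n)}$ and $T - T'' \geq 2^{-\mathrm{poly}(n)}$ where $n$ is the input length (i.e. the bit
lengths of the integers $z_1, \ldots, z_k, t, t'$).

Proof of Claim.

$$T' - T = \sqrt[D]{j+1} - \frac{\prod_{i \in [k]} w_i}{\sqrt[D]{j+1}} - \left( \sqrt[D]{j} - \frac{\prod_{i \in [k]} w_i}{\sqrt[D]{j}} \right) \geq \sqrt[D]{j+1} - \sqrt[D]{j} \geq \frac{1}{D(j+1)}$$

where the last inequality follows from Fact 4.11. This final value is only exponentially
small because $j$ is upper bounded by $\prod_{i=1}^{k} z_i$, which is at most exponentially large in
the bit length of the $z_i$'s. A very similar proof shows that $(T'', T)$ is only exponentially
small. □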
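The use of Fact 4.11 in this claim can be spelled out as follows. This is a worked sketch under the assumption that the fact is applied to $f(z) = z^{1/D}$ on $[j, j+1]$, which is how we read the proof.

```latex
\[
  f(z) = z^{1/D}, \qquad f'(z) = \frac{1}{D}\, z^{-(D-1)/D}
  \quad\text{(decreasing in } z\text{, so minimised at } z = j+1\text{)}
\]
\[
  \sqrt[D]{j+1} - \sqrt[D]{j}
  \;\ge\; \bigl((j+1) - j\bigr) \cdot \min_{z \in [j,\, j+1]} f'(z)
  \;=\; \frac{1}{D\,(j+1)^{(D-1)/D}}
  \;\ge\; \frac{1}{D\,(j+1)} .
\]
```

If $D$ and $j + 1$ are each at most exponential in the input length, the right-hand side is at least $2^{-\mathrm{poly}(n)}$.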


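This means that we can always find $\hat{T} \in (T, T')$ such that $\hat{T}$ is rational and can be fully specified
with a bit length that is polynomial in the input length. Fix such a quantity $\hat{T}$. For all $y > 0$,
define $\mathcal{P}^{y} = \{P \subseteq [k] \mid \prod_{i \in P} w_i - \prod_{i \notin P} w_i \geq y\}$. Then, since #x-PARTITION has no solutions for
$x \in (T, \hat{T})$:

$$\left| \left\{ P \subseteq [k] \;\middle|\; \prod_{i \in P} w_i - \prod_{i \notin P} w_i = T \right\} \right| = \left| \mathcal{P}^{T} \setminus \mathcal{P}^{\hat{T}} \right|$$
$$= \frac{1}{T} \sum_{P \in \mathcal{P}^{T} \setminus \mathcal{P}^{\hat{T}}} \left( \prod_{i \in P} w_i - \prod_{i \notin P} w_i \right)$$
$$= \frac{1}{T} \left( \sum_{P \in \mathcal{P}^{T}} \left( \prod_{i \in P} w_i - \prod_{i \notin P} w_i \right) - \sum_{P \in \mathcal{P}^{\hat{T}}} \left( \prod_{i \in P} w_i - \prod_{i \notin P} w_i \right) \right)$$

We now show how to compute the two sums in the final term using the SUM-PARTITION
oracle. We will give the procedure for computing $\sum_{P \in \mathcal{P}^{T}} \left( \prod_{i \in P} w_i - \prod_{i \notin P} w_i \right)$; the case of $\hat{T}$
will follow by symmetry. The oracle returns a real number, so by our model of computing
real-valued functions, we will also give the oracle an additional input that specifies the number of bits
of precision in its output. Ultimately we only need to approximate each sum to within $\pm T/4$. This
will give an approximation to the #T-PARTITION problem to within $\pm 1/2$, thereby solving it
by rounding the approximation because the solution will be an integer. We want to set the input
$r$ to the SUM-PARTITION oracle to be $r = r_T$ such that for all $P \subseteq [k]$, we have: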


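$$\prod_{i \in P} w_i - r_T \cdot \prod_{i \notin P} w_i \geq 0 \iff \prod_{i \in P} w_i - \prod_{i \notin P} w_i \geq T \qquad (6)$$

Taking $w = \prod_{i \in [k]} w_i$ and thinking of $v = \prod_{i \in P} w_i$, it suffices that all positive solutions to each
of the following two inequalities are the same:

$$v - r_T \cdot \frac{w}{v} \geq 0 \quad \text{and} \quad v - \frac{w}{v} \geq T$$

The positive solutions to the left one are $v \geq \sqrt{r_T \cdot w}$, and to the right one are $v \geq (T + \sqrt{T^2 + 4w})/2$.
Setting the right-hand sides equal gives

$$r_T = \frac{\left( T + \sqrt{T^2 + 4w} \right)^2}{4w} \qquad (7)$$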
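Equation (7) is easy to check numerically: with $r = r_T$, the threshold $\sqrt{r_T \cdot w}$ of the left inequality coincides with the threshold $(T + \sqrt{T^2 + 4w})/2$ of the right one. The helper name `r_T` below is an assumption of this sketch, which uses floating point.

```python
import math


def r_T(T, w):
    """Ratio from Equation (7): r_T = (T + sqrt(T^2 + 4w))^2 / (4w)."""
    return (T + math.sqrt(T * T + 4.0 * w)) ** 2 / (4.0 * w)
```

For example, $T = 3$ and $w = 4$ give $r_T = 4$, so both thresholds equal $v = 4$, and indeed $4 - 4/4 = 3 = T$. At $T = 0$ the ratio is 1, as expected.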


Since $r_T$ might be irrational and SUM-PARTITION takes as input rational values of $r$, we
need to find a rational $r$ that approximates $r_T$ and preserves the set of solutions $\mathcal{P}^{T}$. Recall from
Claim 4.13 that there is an (only) exponentially small interval $(T'', T)$ below $T$ such that for all
$\tilde{T} \in (T'', T)$, $\mathcal{P}^{\tilde{T}} = \mathcal{P}^{T}$. This translates to a corresponding interval $(r_{T''}, r_T)$ such that for all
$r \in (r_{T''}, r_T)$, equivalence (6) holds. Furthermore, this interval is also only exponentially small.


Claim 4.14. $r_T - r_{T''} \geq 2^{-\mathrm{poly}(n)}$ where $n$ is the input length (i.e. the bit lengths of the integers
$z_1, \ldots, z_k, t, t'$).

Proof of Claim. To see this, view $r_T$ from Equation (7) as a function $r(T)$ of $T$, and
calculate the derivative:

$$r'(T) = \frac{\left( T + \sqrt{T^2 + 4w} \right)^2}{2w \cdot \sqrt{T^2 + 4w}}$$

Fact 4.11 says that:

$$r_T - r_{T''} = r(T) - r(T'') \geq \left( \min_{z \in [T'', T]} r'(z) \right) \cdot (T - T'') \geq (T - T'') \cdot \frac{1}{\mathrm{poly}(n)}$$

(Recall that $1 \leq w = \prod_i w_i \leq c$). This is only exponentially small in the input length
by Claim 4.13. □

So we can choose a rational $r \in (r_{T''}, r_T)$ that can be specified with a number of bits that is polynomial
in the input length and preserves $\mathcal{P}^{T} = \{P \subseteq [k] \mid \prod_{i \in P} w_i - r \cdot \prod_{i \notin P} w_i \geq 0\}$. However
the SUM-PARTITION oracle gives us

$$\sum_{P \subseteq [k]} \max\left\{ \prod_{i \in P} w_i - r \cdot \prod_{i \notin P} w_i,\ 0 \right\} = \sum_{P \in \mathcal{P}^{T}} \left( \prod_{i \in P} w_i - r \cdot \prod_{i \notin P} w_i \right)$$

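whereas we want to compute the right-hand side without the $r$ coefficient. To get this we just pick
another rational $r' \in (r_{T''}, r_T)$ such that $r' - r$ is only exponentially small. If precision were not an issue, we could
run our SUM-PARTITION oracle for $r$ and $r'$ and receive the output:

$$S_1 = \sum_{P \in \mathcal{P}^{T}} \left( \prod_{i \in P} w_i - r \cdot \prod_{i \notin P} w_i \right)$$
$$S_2 = \sum_{P \in \mathcal{P}^{T}} \left( \prod_{i \in P} w_i - r' \cdot \prod_{i \notin P} w_i \right)$$

Then the following linear combination of $S_1$ and $S_2$ gives us what we want:

$$\frac{r' - 1}{r' - r} \cdot S_1 - \frac{r - 1}{r' - r} \cdot S_2 = \sum_{P \in \mathcal{P}^{T}} \left( \prod_{i \in P} w_i - \prod_{i \notin P} w_i \right)$$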
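The linear combination amounts to solving two linear equations in the two unknown sums. Writing $A = \sum_{P \in \mathcal{P}^{T}} \prod_{i \in P} w_i$ and $B = \sum_{P \in \mathcal{P}^{T}} \prod_{i \notin P} w_i$, we have $S_1 = A - rB$ and $S_2 = A - r'B$, and the goal is $A - B$. The sketch below, with the hypothetical helper name `unweighted_sum`, checks this with exact rational arithmetic, ignoring the precision issues treated in the next claim.

```python
from fractions import Fraction


def unweighted_sum(S1, S2, r, r_prime):
    """Recover A - B from S1 = A - r*B and S2 = A - r'*B (requires r != r')."""
    return ((r_prime - 1) * S1 - (r - 1) * S2) / (r_prime - r)
```

For instance, $A = 7$, $B = 3$, $r = 2$, $r' = 5/2$ give $S_1 = 1$ and $S_2 = -1/2$, and the combination returns $4 = A - B$.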


Claim 4.15. Computing $S_1$ and $S_2$ to within $\pm 2^{-\mathrm{poly}(n)}$ yields an approximation of $\sum_{P \in \mathcal{P}^{T}} \left( \prod_{i \in P} w_i - \prod_{i \notin P} w_i \right)$
to within $\pm T/4$.

Proof of Claim. We just need to approximate $S_1$ and $S_2$ to within a sufficiently small additive error to get the
desired precision. This additive error is only exponentially small by Claim 4.14. □

Running this whole procedure again for $\hat{T} \in (T, T')$, which we fixed above, gives us all the
information we need to count the number of solutions to the #T-PARTITION instance we were
given. We can solve #T-PARTITION in polynomial time with four calls to a SUM-PARTITION
oracle. Therefore SUM-PARTITION is #P-hard. □

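Now we prove that computing OptComp is #P-complete.

Proof of Theorem 1.6. We have already shown that computing OptComp is #P-easy. Here we
prove that it is also #P-hard, thereby proving #P-completeness.

We are given an instance $D$, $W = \{w_1, \ldots, w_k\}$, $r \in \mathbb{Q}$, and $q$ of SUM-PARTITION, where
$\forall i \in [k]$, $w_i$ is the $D$th root of a corresponding integer $z_i$, $\prod_i w_i \leq c$ and $q$ specifies the desired
number of bits of precision in the output. If we disregard precision, we would like to set $\epsilon_i =
\ln(w_i)$ $\forall i \in [k]$, $\delta_1 = \delta_2 = \cdots = \delta_k = 0$ and $\epsilon_g = \ln(r)$. Note that $\sum_i \epsilon_i = \ln(\prod_i w_i) \leq \ln(c)$. Since we
can take $c$ to be an arbitrary constant greater than 1, we can ensure that $\sum_i \epsilon_i \leq \epsilon$ for an arbitrary
$\epsilon > 0$.

Again we will use the version of OptComp that takes $\epsilon_g$ as input and outputs $\delta_g$. After using
an OptComp oracle to find $\delta_g$ we know the optimal composition equation from Theorem 1.5 is
satisfied:

$$\frac{1}{\prod_{i=1}^{k} (1 + e^{\epsilon_i})} \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\} = 1 - \frac{1 - \delta_g}{\prod_{i=1}^{k} (1 - \delta_i)} = \delta_g$$

Thus we can compute:

$$\delta_g \cdot \prod_{i=1}^{k} (1 + e^{\epsilon_i}) = \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\} = \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ \prod_{i \in S} w_i - r \cdot \prod_{i \notin S} w_i,\ 0 \right\}$$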


This last expression is exactly the solution to the instance of SUM-PARTITION we were
given. Taking precision into account, the input SUM-PARTITION instance has an additional
input $q$ that specifies the desired number of bits of precision in the output and we can only pass
OptComp rational values so we will have to approximate $\epsilon_i = \ln(w_i)$ for all $i$ and $\epsilon_g = \ln(r)$.
Again there is a worry that when we approximate these values the set of partitions $S$ that make
$\prod_{i \in S} w_i - r \cdot \prod_{i \notin S} w_i > 0$ might change. We want to get enough precision in our inputs so that
the set of partitions over which we sum does not change and enough precision so that the output
is accurate to $q$ bits. We will calculate the approximations required for each of these two goals
separately and the final precision that we use will just be the maximum of the two. We prove that
we can achieve both of these goals with the next two claims.


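Claim 4.16. There exists a polynomial $p(n)$ in the length $n$ of the input (the bit lengths of
$z_1, \ldots, z_k, q$, and the numerator and denominator of $r$) such that if $|w_i - w_i'| < 2^{-p(n)}$ for each
$i$, then the set of partitions $S$ satisfying

$$\prod_{i \in S} w_i - r \cdot \prod_{i \notin S} w_i > 0$$

is the same as the set of partitions satisfying

$$\prod_{i \in S} w_i' - r \cdot \prod_{i \notin S} w_i' > 0$$

Proof of Claim. Recall that SUM-PARTITION is #P-hard even on instances where
there are no partitions $S$ such that $\prod_{i \in S} w_i = r \cdot \prod_{i \notin S} w_i$, so we may assume our input
instance of SUM-PARTITION has no such partitions and still prove the hardness of
OptComp. So to ensure that we have enough precision such that the set over which
we sum does not change, we must make the error smaller than the minimum possible
(in absolute value) nonzero outcome of $\prod_{i \in S} w_i - r \cdot \prod_{i \notin S} w_i$. We now bound this
quantity. Let

$$\mathcal{S} = \left\{ S \subseteq [k] \;\middle|\; \prod_{i \in S} w_i \neq r \cdot \prod_{i \notin S} w_i \right\}$$

Since $r$ is rational, $r = a/b$ for two integers $a$ and $b$. Let $a' = a^{D}$ and $b' = b^{D}$. Then:

$$\min_{S \in \mathcal{S}} \left| \prod_{i \in S} w_i - r \cdot \prod_{i \notin S} w_i \right| = \min_{S \in \mathcal{S}} \left| \sqrt[D]{\prod_{i \in S} z_i} - \sqrt[D]{\frac{a'}{b'} \cdot \prod_{i \notin S} z_i} \right|$$
$$\geq \min_{S \in \mathcal{S}} \left| \prod_{i \in S} z_i - \frac{a'}{b'} \cdot \prod_{i \notin S} z_i \right| \cdot \frac{1}{D \left( \prod_{i \in [k]} z_i \right)^{(D-1)/D}}$$

Where the last line follows from Fact 4.11 applied to the function $f(x) = x^{1/D}$. The factor
$1 / \left( D \left( \prod_{i \in [k]} z_i \right)^{(D-1)/D} \right)$ is only exponentially small because $\prod_{i \in [k]} z_i$ is at most exponentially
large in the bit length of the integers $z_1, \ldots, z_k$. We claim that $\left| \prod_{i \in S} z_i - \frac{a'}{b'} \cdot \prod_{i \notin S} z_i \right|$
is at least $1/b'$ for all $S \in \mathcal{S}$. Fix $S \in \mathcal{S}$:

$$\left| \prod_{i \in S} z_i - \frac{a'}{b'} \cdot \prod_{i \notin S} z_i \right| = h \implies \left| b' \cdot \prod_{i \in S} z_i - a' \cdot \prod_{i \notin S} z_i \right| = h \cdot b' \implies h \geq 1/b'$$

Where the last implication follows because $b' \cdot \prod_{i \in S} z_i - a' \cdot \prod_{i \notin S} z_i$ is just a difference
of integers so the closest nonzero value it can take on is $\pm 1$. □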

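Claim 4.17. There exists a polynomial $p(n)$ in the length $n$ of the input (the bit lengths of
$z_1, \ldots, z_k, q$, and the numerator and denominator of $r$) such that if $|w_i - w_i'| < 2^{-p(n)}$ for each
$i$, then

$$\left| \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ \prod_{i \in S} w_i' - r \cdot \prod_{i \notin S} w_i',\ 0 \right\} - \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ \prod_{i \in S} w_i - r \cdot \prod_{i \notin S} w_i,\ 0 \right\} \right| \leq 2^{-q}$$

Proof of Claim. We will choose $p(n) = p_1(n) + p_2(n)$ where $p_1(n)$ is the polynomial
that exists from Claim 4.16 and $p_2(n)$ will be determined later. Define

$$\mathcal{S}^{+} = \left\{ S \subseteq [k] \;\middle|\; \prod_{i \in S} w_i - r \cdot \prod_{i \notin S} w_i > 0 \right\}$$

Claim 4.16 says that:

$$\mathcal{S}^{+} = \left\{ S \subseteq [k] \;\middle|\; \prod_{i \in S} w_i' - r \cdot \prod_{i \notin S} w_i' > 0 \right\}$$

Now we can write

$$\left| \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ \prod_{i \in S} w_i' - r \cdot \prod_{i \notin S} w_i',\ 0 \right\} - \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ \prod_{i \in S} w_i - r \cdot \prod_{i \notin S} w_i,\ 0 \right\} \right|$$
$$= \left| \sum_{S \in \mathcal{S}^{+}} \left( \prod_{i \in S} w_i' - r \cdot \prod_{i \notin S} w_i' \right) - \sum_{S \in \mathcal{S}^{+}} \left( \prod_{i \in S} w_i - r \cdot \prod_{i \notin S} w_i \right) \right|$$
$$= \left| \sum_{S \in \mathcal{S}^{+}} \left( \prod_{i \in S} w_i' - \prod_{i \in S} w_i \right) - r \cdot \sum_{S \in \mathcal{S}^{+}} \left( \prod_{i \notin S} w_i' - \prod_{i \notin S} w_i \right) \right|$$
$$\leq \left| \sum_{S \in \mathcal{S}^{+}} \left( \prod_{i \in S} w_i' - \prod_{i \in S} w_i \right) \right| + \left| r \cdot \sum_{S \in \mathcal{S}^{+}} \left( \prod_{i \notin S} w_i' - \prod_{i \notin S} w_i \right) \right|$$

Bounding each term in the final expression above by $2^{-(q+1)}$ then gives us the accuracy
we want. We will show directly how to bound the second term and the argument for
the first term follows symmetrically. By hypothesis we have that for all $S \subseteq [k]$:

$$\prod_{i \notin S} w_i' \leq \prod_{i \notin S} \left( w_i + 2^{-p(n)} \right) \leq \prod_{i \notin S} \left( 1 + 2^{-p(n)} \right) w_i \leq \left( 1 + 2^{-p(n)} \right)^{k} \prod_{i \notin S} w_i$$

and similarly

$$\prod_{i \notin S} w_i' \geq \left( 1 - 2^{-p(n)} \right)^{k} \prod_{i \notin S} w_i$$

It follows that for all $S \subseteq [k]$:

$$\left( \left( 1 - 2^{-p(n)} \right)^{k} - 1 \right) \prod_{i \notin S} w_i \leq \prod_{i \notin S} w_i' - \prod_{i \notin S} w_i \leq \left( \left( 1 + 2^{-p(n)} \right)^{k} - 1 \right) \prod_{i \notin S} w_i$$

Since $|\mathcal{S}^{+}| \leq 2^{k}$ and $1 \leq \prod_{i \notin S} w_i \leq c$:

$$2^{k} \cdot r \cdot c \cdot \left( \left( 1 - 2^{-p(n)} \right)^{k} - 1 \right) \leq r \cdot \sum_{S \in \mathcal{S}^{+}} \left( \prod_{i \notin S} w_i' - \prod_{i \notin S} w_i \right) \leq 2^{k} \cdot r \cdot c \cdot \left( \left( 1 + 2^{-p(n)} \right)^{k} - 1 \right)$$

Picking $p_2(n)$ such that $p(n) = p_1(n) + p_2(n) \geq 2k + \log(rc) + q + 1$ then suffices to
bound the absolute value of the sum by $2^{-(q+1)}$. Repeating the same calculation for
$\sum_{S \in \mathcal{S}^{+}} \left( \prod_{i \in S} w_i' - \prod_{i \in S} w_i \right)$ will yield the same approximation except without the
factor of $r$. So we can bound both terms by $2^{-(q+1)}$ (and therefore their sum by $2^{-q}$)
by approximating each $w_i$ to a precision that is polynomial in $n$, which proves the
claim. □

So by the two claims above we can get an approximation of the SUM-PARTITION instance
to $q$ bits of precision in polynomial time with access to an OptComp oracle. Therefore computing
OptComp is #P-hard. □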


5 Approximation of OptComp 


Although we cannot hope to efficiently compute the optimal composition for a general set of
differentially private algorithms (assuming P ≠ NP or even FP ≠ #P), we show in this section that we
can approximate OptComp to arbitrary precision in polynomial time.

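Theorem 1.7 (restated). There is a polynomial-time algorithm that given rational $\epsilon_1, \ldots, \epsilon_k \geq 0$,
$\delta_1, \ldots, \delta_k, \delta_g \in [0, 1)$, and $\eta \in (0, 1)$, outputs $\epsilon^{*}$ satisfying

$$\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g) \leq \epsilon^{*} \leq \mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), e^{-\eta/2} \cdot \delta_g) + \eta$$

The algorithm runs in time

$$O\left( \frac{k^{3} \cdot \epsilon \cdot (1 + \epsilon)}{\eta} \cdot \log\left( \frac{k^{2} \cdot \epsilon \cdot (1 + \epsilon)}{\eta} \right) \right)$$

where $\epsilon = \sum_{i \in [k]} \epsilon_i / k$, assuming constant-time arithmetic operations.

We prove Theorem 1.7 using the following three lemmas: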

Lemma 5.1. Given non-negative integers $a_1, \ldots, a_k, B$ and weights $w_1, \ldots, w_k \in \mathbb{Q}$, one can
compute

$$\sum_{\substack{S \subseteq [k] \text{ s.t.} \\ \sum_{i \in S} a_i \leq B}} \prod_{i \in S} w_i$$

in time $O(Bk)$.

Notice that the constraint in Lemma 5.1 is the same one that characterizes knapsack problems.
Indeed, the algorithm we give for computing $\sum_{S \subseteq [k]} \prod_{i \in S} w_i$ is a slight modification of the known
pseudo-polynomial time algorithm for counting knapsack solutions, which uses dynamic programming.
Next we show that we can use this algorithm to approximate OptComp.


Lemma 5.2. Given a rational $e^{\epsilon_0}$ with $\epsilon_0 \geq 0$ and $\epsilon_1 = a_1 \cdot \epsilon_0, \ldots, \epsilon_k = a_k \cdot \epsilon_0, \epsilon^{*} = a^{*} \cdot \epsilon_0$
for positive integers $a_1, \ldots, a_k, a^{*}$ (given as input), and rational $\delta_1, \ldots, \delta_k, \delta_g \in [0, 1)$, there is an
algorithm that determines whether or not $\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g) \leq \epsilon^{*}$ and runs in time
$O\left( k \cdot \sum_{i=1}^{k} a_i \right)$ assuming constant-time arithmetic operations.

In other words, if the $\epsilon$ values we are given are all integer multiples of some $\epsilon_0$ where $e^{\epsilon_0}$ is
rational, we can determine whether or not the composition of those privacy parameters is $(a^{*} \cdot \epsilon_0, \delta_g)$-DP
in pseudo-polynomial time, for every positive integer $a^{*}$. Running binary search over integers
$a^{*}$, we can find the minimum such integer. When $\epsilon_0$ is small, this gives us a good overestimate of
the optimal composition of the discrete input privacy parameters. This means that given any inputs
$(\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g$ to OptComp, we can discretize and polynomially bound the $\epsilon_i$ values to new
values $\epsilon_i'$ for all $i \in [k]$ and use Lemma 5.2 to approximate $\mathrm{OptComp}((\epsilon_1', \delta_1), \ldots, (\epsilon_k', \delta_k), \delta_g)$. The
next lemma tells us that this is also a good approximation of $\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g)$.

Lemma 5.3. For all $\epsilon_1, \ldots, \epsilon_k, c_1, \ldots, c_k \geq 0$ and $\delta_1, \ldots, \delta_k, \delta_g \in [0, 1)$:

$$\mathrm{OptComp}((\epsilon_1 + c_1, \delta_1), \ldots, (\epsilon_k + c_k, \delta_k), \delta_g) \leq \mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), e^{-c/2} \cdot \delta_g) + c$$

where $c = \sum_{i \in [k]} c_i$.

Next we prove the three lemmas and then show that Theorem 1.7 follows.


Proof of Lemma 5.1. We modify Dyer's algorithm for approximately counting solutions to knapsack
problems [Dye03]. The algorithm uses dynamic programming. Given non-negative integers
$a_1, \ldots, a_k, B$, and weights $w_1, \ldots, w_k \in \mathbb{Q}$, define

$$F(r, s) = \sum_{\substack{S \subseteq [r] \text{ s.t.} \\ \sum_{i \in S} a_i \leq s}} \prod_{i \in S} w_i$$

We want to compute $F(k, B)$. We can find this by tabulating $F(r, s)$ for ($0 \leq r \leq k$, $0 \leq s \leq B$)
using the recursion:

$$F(r, s) = \begin{cases} 1 & \text{if } r = 0 \\ F(r-1, s) + w_r F(r-1, s - a_r) & \text{if } r > 0 \text{ and } a_r \leq s \\ F(r-1, s) & \text{if } r > 0 \text{ and } a_r > s \end{cases}$$

Each cell $F(r, s)$ in the table can be computed in constant time given earlier cells $F(r', s')$ where
$r' < r$. Thus filling the entire table takes time $O(Bk)$. □
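The recursion above translates directly into a table-filling routine. The following is a minimal sketch using exact rational arithmetic and keeping only one row of the table at a time; the function name `knapsack_weighted_sum` and the convention of returning 0 for a negative budget are assumptions of this example.

```python
from fractions import Fraction


def knapsack_weighted_sum(a, w, B):
    """Sum of prod(w_i for i in S) over all S with sum(a_i for i in S) <= B.

    Implements the recursion for F(r, s): row[s] holds F(r, s) for the
    current r. Uses O(B * k) arithmetic operations.
    """
    if B < 0:
        return Fraction(0)  # no subset fits a negative budget
    row = [Fraction(1)] * (B + 1)  # F(0, s) = 1: only the empty set
    for a_r, w_r in zip(a, w):
        new_row = []
        for s in range(B + 1):
            value = row[s]  # subsets that exclude item r
            if a_r <= s:
                value += Fraction(w_r) * row[s - a_r]  # subsets that include item r
            new_row.append(value)
        row = new_row
    return row[B]
```

For $a = (1, 2, 3)$, $w = (2, 3, 5)$ and $B = 3$, the feasible sets are $\emptyset$, $\{1\}$, $\{2\}$, $\{3\}$ and $\{1, 2\}$, giving $1 + 2 + 3 + 5 + 6 = 17$.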


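Proof of Lemma 5.2. Given a rational $e^{\epsilon_0}$ with $\epsilon_0 \geq 0$ and $\epsilon_1 = a_1 \cdot \epsilon_0, \ldots, \epsilon_k = a_k \cdot \epsilon_0, \epsilon^{*} = a^{*} \cdot \epsilon_0$ for
positive integers $a_1, \ldots, a_k, a^{*}$ and rational $\delta_1, \ldots, \delta_k, \delta_g \in [0, 1)$, Theorem 1.5 tells us that answering
whether or not

$$\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g) \leq \epsilon^{*}$$

is equivalent to answering whether or not the following inequality holds:

$$\frac{1}{\prod_{i=1}^{k} (1 + e^{\epsilon_i})} \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon^{*}} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\} \leq 1 - \frac{1 - \delta_g}{\prod_{i=1}^{k} (1 - \delta_i)} \qquad (8)$$

The right-hand side and $\prod_{i=1}^{k} (1 + e^{\epsilon_i})$ are easy to compute given the inputs (note that $e^{\epsilon_i}$ is
rational for all $i \in [k]$ because each is an integer power of $e^{\epsilon_0}$). So in order to check the inequality,
we will show how to compute the sum. Define

$$K = \left\{ T \subseteq [k] \;\middle|\; \sum_{i \in T} a_i \leq B \right\} \quad \text{for} \quad B = \left\lfloor \left( \sum_{i=1}^{k} a_i - a^{*} \right) / 2 \right\rfloor$$

and observe that by setting $T = S^{c}$, we have

$$\sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon^{*}} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\} = \sum_{T \in K} \left( \left( \prod_{i=1}^{k} e^{\epsilon_i} \right) \cdot \prod_{i \in T} e^{-\epsilon_i} - e^{\epsilon^{*}} \cdot \prod_{i \in T} e^{\epsilon_i} \right)$$

We can now use Lemma 5.1 to compute each term separately since $K$ is a set of knapsack
solutions. Specifically, setting $w_i = e^{-\epsilon_i}$ $\forall i \in [k]$, Lemma 5.1 tells us that we can compute
$\sum_{T \subseteq [k]} \prod_{i \in T} w_i$ subject to $\sum_{i \in T} a_i \leq B$, which is equivalent to $\sum_{T \in K} \prod_{i \in T} e^{-\epsilon_i}$. To compute
$\sum_{T \in K} \prod_{i \in T} e^{\epsilon_i}$, we instead set $w_i = e^{\epsilon_i}$ and run the same procedure. (Note that $e^{\epsilon_i} = (e^{\epsilon_0})^{a_i}$,
which is rational.) So we can determine whether or not Inequality (8) holds. We used the algorithm
from Lemma 5.1 so the running time is $O(Bk) = O\left( k \cdot \sum_{i=1}^{k} a_i \right)$. □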
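Putting the pieces of this proof together, the check of Inequality (8) can be sketched as below. This is an illustrative reading of the proof under stated assumptions, not a reference implementation: the function names `knapsack_sum` and `composition_holds` are hypothetical, the knapsack routine is restated so the block is self-contained, and exact `Fraction` arithmetic stands in for the constant-time arithmetic model.

```python
from fractions import Fraction
from math import prod


def knapsack_sum(a, w, B):
    """Sum of prod(w_i, i in T) over T with sum(a_i, i in T) <= B (Lemma 5.1 table)."""
    if B < 0:
        return Fraction(0)  # K is empty when the budget is negative
    row = [Fraction(1)] * (B + 1)
    for a_r, w_r in zip(a, w):
        row = [row[s] + (w_r * row[s - a_r] if a_r <= s else 0) for s in range(B + 1)]
    return row[B]


def composition_holds(exp_eps0, a, deltas, delta_g, a_star):
    """Decide Inequality (8), where eps_i = a_i * eps_0 and eps* = a_star * eps_0.

    exp_eps0 is the rational number e^{eps_0}.
    """
    e0 = Fraction(exp_eps0)
    B = (sum(a) - a_star) // 2              # floor((sum_i a_i - a*) / 2)
    exp_eps = [e0 ** a_i for a_i in a]      # e^{eps_i} = (e^{eps_0})^{a_i}
    sum_neg = knapsack_sum(a, [1 / x for x in exp_eps], B)  # weights e^{-eps_i}
    sum_pos = knapsack_sum(a, exp_eps, B)                   # weights e^{eps_i}
    subset_sum = prod(exp_eps) * sum_neg - e0 ** a_star * sum_pos
    lhs = subset_sum / prod(1 + x for x in exp_eps)
    rhs = 1 - (1 - Fraction(delta_g)) / prod(1 - Fraction(d) for d in deltas)
    return lhs <= rhs
```

As a small check, take $e^{\epsilon_0} = 2$, $a = (1, 1)$, $\delta_1 = \delta_2 = 0$ and $a^{*} = 1$. The only positive term is $S = \{1, 2\}$, contributing $4 - 2 = 2$, so the left-hand side is $2/9$ and the inequality holds exactly when $\delta_g \geq 2/9$.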

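Proof of Lemma 5.3. Fix $\epsilon_1, \ldots, \epsilon_k, c_1, \ldots, c_k \geq 0$ and $\delta_1, \ldots, \delta_k, \delta_g \in [0, 1)$ and let $c = \sum_{i \in [k]} c_i$.
Let $\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), e^{-c/2} \cdot \delta_g) = \epsilon_g$. From the equation in Theorem 1.5
we know:

$$\frac{1}{\prod_{i=1}^{k} (1 + e^{\epsilon_i})} \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\} \leq 1 - \frac{1 - e^{-c/2} \cdot \delta_g}{\prod_{i=1}^{k} (1 - \delta_i)}$$

Multiplying both sides by $e^{c/2}$ gives:

$$\frac{e^{c/2}}{\prod_{i=1}^{k} (1 + e^{\epsilon_i})} \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\} \leq e^{c/2} \cdot \left( 1 - \frac{1 - e^{-c/2} \cdot \delta_g}{\prod_{i=1}^{k} (1 - \delta_i)} \right) \leq 1 - \frac{1 - \delta_g}{\prod_{i=1}^{k} (1 - \delta_i)}$$

The above inequality together with Theorem 1.5 means that showing the following will complete
the proof:

$$\frac{1}{\prod_{i=1}^{k} (1 + e^{\epsilon_i + c_i})} \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} (\epsilon_i + c_i)} - e^{\epsilon_g + c} \cdot e^{\sum_{i \notin S} (\epsilon_i + c_i)},\ 0 \right\} \leq \frac{e^{c/2}}{\prod_{i=1}^{k} (1 + e^{\epsilon_i})} \sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\}$$

Since $(1 + e^{\epsilon_i + c_i}) / (1 + e^{\epsilon_i}) \geq e^{c_i/2}$ for every $\epsilon_i, c_i \geq 0$, it suffices to show:

$$\sum_{S \subseteq \{1, \ldots, k\}} \max\left\{ e^{\sum_{i \in S} (\epsilon_i + c_i)} - e^{\epsilon_g + c} \cdot e^{\sum_{i \notin S} (\epsilon_i + c_i)},\ 0 \right\} \leq \sum_{S \subseteq \{1, \ldots, k\}} e^{c} \cdot \max\left\{ e^{\sum_{i \in S} \epsilon_i} - e^{\epsilon_g} \cdot e^{\sum_{i \notin S} \epsilon_i},\ 0 \right\}$$

This inequality holds term by term. If a right-hand term is zero $\left( \sum_{i \in S} \epsilon_i \leq \epsilon_g + \sum_{i \notin S} \epsilon_i \right)$, then
so is the corresponding left-hand term $\left( \sum_{i \in S} (\epsilon_i + c_i) \leq \epsilon_g + c + \sum_{i \notin S} (\epsilon_i + c_i) \right)$. For the nonzero
terms, the factor of $e^{c}$ ensures that the right-hand terms are larger than the left-hand terms. □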


Proof of Theorem 1.7. Lemma 5.2 tells us that we can determine whether a set of privacy parameters
satisfies some $(\epsilon_g, \delta_g)$ differential privacy guarantee if the $\epsilon_i$ values and $\epsilon_g$ are all positive integer
multiples of some $\epsilon_0$ where $e^{\epsilon_0}$ is rational. We are given rational $\epsilon_1, \ldots, \epsilon_k \geq 0$, $\delta_1, \ldots, \delta_k, \delta_g \in [0, 1)$,
and $\eta \in (0, 1)$. Let $\epsilon = \sum_{i \in [k]} \epsilon_i / k$ be the arithmetic mean of the $\epsilon_i$ values. Let $\beta = \eta / (k \cdot (1 + \epsilon) + 1)$,
set $\epsilon_0 = \ln(1 + \beta)$, and for all $i \in [k]$ set $a_i = \lceil \epsilon_i \cdot (1/\beta + 1) \rceil$ and $\epsilon_i' = \epsilon_0 \cdot a_i$. We will use the
following bounds on $\epsilon_0$ in the proof:

$$\frac{\beta}{1 + \beta} \leq \epsilon_0 \leq \beta$$

With these settings, the $a_i$'s are non-negative integers, the $\epsilon_i'$ values are all integer multiples of $\epsilon_0$
and $e^{\epsilon_0}$ is rational. So for every positive integer $a$ we can apply Lemma 5.2 to determine whether or
not $\mathrm{OptComp}((\epsilon_1', \delta_1), \ldots, (\epsilon_k', \delta_k), \delta_g) \leq a \cdot \epsilon_0$ in time $O\left( k \cdot \sum_{i=1}^{k} a_i \right)$. Running binary search over
integers $a$, we can find the minimum such integer, which we will call $a^{*}$. The algorithm's estimate of
$\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g)$ will be $a^{*} \cdot \epsilon_0$. However since this number is irrational, we will use
the Taylor approximation of the natural logarithm to output $\epsilon^{*}$ satisfying $a^{*} \cdot \epsilon_0 \leq \epsilon^{*} \leq a^{*} \cdot \epsilon_0 + \beta - \epsilon_0$.
Since we only need to calculate a few terms of the Taylor expansion of $\ln(1 + \beta)$ to achieve this
approximation, this step will not affect our running time.
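The discretization step can be illustrated numerically. The sketch below computes $\beta$, $\epsilon_0$, the integers $a_i$ and the rounded values $\epsilon_i'$, so that the bounds $\epsilon_i \leq \epsilon_i' \leq \epsilon_i + \beta(\epsilon_i + 1)$ used later in the proof can be checked on examples. It uses floating point, whereas the proof works with rational inputs and a rational $e^{\epsilon_0} = 1 + \beta$; the function name `discretize` is an assumption of this example.

```python
import math


def discretize(eps, eta):
    """Round each eps_i up to an integer multiple of eps_0 = ln(1 + beta).

    Returns (beta, eps0, a, eps_prime) with eps_prime[i] = eps0 * a[i].
    """
    k = len(eps)
    mean_eps = sum(eps) / k                       # arithmetic mean of the eps_i
    beta = eta / (k * (1.0 + mean_eps) + 1.0)
    eps0 = math.log1p(beta)                       # ln(1 + beta)
    a = [math.ceil(e * (1.0 / beta + 1.0)) for e in eps]
    eps_prime = [eps0 * a_i for a_i in a]
    return beta, eps0, a, eps_prime
```

A typical use is `beta, eps0, a, eps_prime = discretize([0.1, 0.5, 1.0], 0.5)`, after which each `eps_prime[i]` overestimates `eps[i]` by at most `beta * (eps[i] + 1)`.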

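Since we choose $a^{*}$ to be the minimum integer satisfying composition we have:

$$\epsilon^{*} - \beta \leq (a^{*} - 1) \cdot \epsilon_0 < \mathrm{OptComp}((\epsilon_1', \delta_1), \ldots, (\epsilon_k', \delta_k), \delta_g) \leq a^{*} \cdot \epsilon_0 \leq \epsilon^{*}$$

$a^{*}$ can range from 0 to $\sum_{i=1}^{k} a_i$, so the binary search can be done in $O\left( \log \sum_{i=1}^{k} a_i \right)$
iterations. This gives us a total running time of:

$$O\left( \frac{k^{3} \cdot \epsilon \cdot (1 + \epsilon)}{\eta} \cdot \log\left( \frac{k^{2} \cdot \epsilon \cdot (1 + \epsilon)}{\eta} \right) \right)$$

Now we argue that $\epsilon^{*}$ is a good approximation of $\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g)$. For all
$i \in [k]$ we have:

$$\epsilon_i' = \epsilon_0 \cdot a_i \geq \frac{\beta}{1 + \beta} \cdot \epsilon_i \cdot \left( \frac{1}{\beta} + 1 \right) = \epsilon_i$$

So all of the $\epsilon_i'$ values are overestimates of their corresponding $\epsilon_i$ values and therefore

$$\mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), \delta_g) \leq \mathrm{OptComp}((\epsilon_1', \delta_1), \ldots, (\epsilon_k', \delta_k), \delta_g) \leq \epsilon^{*}$$

satisfying one of the inequalities in the theorem. We also have for all $i \in [k]$:

$$\epsilon_i' = \epsilon_0 \cdot a_i \leq \beta \cdot \left( \epsilon_i \cdot \left( \frac{1}{\beta} + 1 \right) + 1 \right) = \epsilon_i + \beta \cdot (\epsilon_i + 1)$$

Let $c_i = \beta \cdot (\epsilon_i + 1)$ for all $i \in [k]$ and let $c = \sum_{i \in [k]} c_i = \beta \cdot k \cdot (1 + \epsilon)$. Now we get

$$\epsilon^{*} - \beta \leq \mathrm{OptComp}((\epsilon_1', \delta_1), \ldots, (\epsilon_k', \delta_k), \delta_g)$$
$$\leq \mathrm{OptComp}((\epsilon_1 + c_1, \delta_1), \ldots, (\epsilon_k + c_k, \delta_k), \delta_g)$$
$$\leq \mathrm{OptComp}((\epsilon_1, \delta_1), \ldots, (\epsilon_k, \delta_k), e^{-c/2} \cdot \delta_g) + \beta \cdot k \cdot (1 + \epsilon)$$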

by Lemma 5.3. Noting that $\beta \cdot k \cdot (1 + \epsilon)$ and $\beta \cdot k \cdot (1 + \epsilon) + \beta$ are both at most $\eta$ completes the
proof. □

A Comparison of Composition Theorems

The figures below compare the performances of four homogeneous composition theorems. In all
figures, “Summing” refers to basic composition - Theorem 1.2 [DKMMN06]. “DRV” refers to
advanced composition - Theorem 1.3 [DRV10]. “KOV Bound” refers to a bound in [KOV15] that
is a closed form approximation of the optimal composition theorem, and “Optimal” refers to the
optimal composition theorem - Theorem 1.4 [KOV15]. Here we are composing $k$ mechanisms that
are $(\epsilon, \delta)$ differentially private to obtain an $(\epsilon_g, \delta_g)$ differentially private mechanism as guaranteed
by one of the composition theorems.


[Figure 1 panels: “Varying Epsilon, k up to 700” (left) and “Varying Epsilon, k up to 100” (right); horizontal axis: k.]

Figure 1: (Left) $\epsilon_g$ given by four composition theorems for varying values of $\epsilon$ as $k$ grows. Parameters
5 = 0 and Sg = 2“^®. (Right) Same plot zoomed in on the k < 100 regime. We see that optimal 
composition gives substantial savings in $\epsilon_g$, even for moderate values of $k$.

[Figure 2 panels: “Varying Global Delta” (left) and “Percentage Comparison” (right).]

Figure 2: (Left) $\epsilon_g$ given by four composition theorems for varying values of $\delta_g$ as $k$ grows, with
parameters $\delta = 0$ and $\epsilon = .005$ for the individual mechanisms. $\delta_g$ does not affect $\epsilon_g$ in basic
composition. (Right) Performance of composition theorems measured relative to optimal composition.
Depicts every curve in Figure 1 divided by the optimal composition curve. We see that relative
performances of the KOV bound and DRV seem to converge to a constant. The $\epsilon_g$ values given by
the KOV bound are about 20% larger than optimal and the values given by advanced composition
are about 30-40% larger than optimal.